PREFACE TO THE
THIRD EDITION 


The many additions and revisions in this third edition of Mathematical
Methods for Physicists are based on 15 years of teaching from the second edition,
on the questions from current students, and on the advice of colleagues, reviewers, 
and former students. Almost every section has been revised; many of the sections 
have been completely rewritten. In most sections, there are new exercises, all class 
tested. New sections have been added on non-Cartesian tensors, dispersion 
theory, first-order differential equations, numerical application of Chebyshev 
polynomials, the fast Fourier transform, and on transfer functions. 

Throughout the text, I have placed significant additional emphasis on numer- 
ical applications and on the relation of these mathematical methods to comput- 
ing and to numerical analysis. 

For students studying graduate level physics, particularly theoretical physics, 
a number of topics including Hermitian operators, Hilbert space, and the concept 
of completeness have been expanded. 






PREFACE TO THE 
SECOND EDITION 


This second edition of Mathematical Methods for Physicists incorporates a 
number of changes, additions, and improvements made on the basis of experience 
with the first edition and the helpful suggestions of a number of people. Major 
revisions have been made in the sections on complex variables, Dirac delta func- 
tion, and Green’s functions. New sections have been included on oblique co- 
ordinates, Fourier-Bessel series, and angular momentum ladder operators. The 
major addition is a series of sections on group theory. While these could have 
been presented as a separate group theory chapter, there seemed to be several
advantages to including them in Chapter 4, Matrices. Since the group theory is
developed in terms of matrices, the arrangement seems a reasonable one.

PREFACE TO THE
FIRST EDITION 


Mathematical Methods for Physicists is based upon two courses in mathematics 
for physicists given by the author over the past fourteen years, one at the junior 
level and one at the beginning graduate level. This book is intended to provide 
the student with the mathematics he needs for advanced undergraduate and 
beginning graduate study in physical science and to develop a strong background 
for those who will continue into the mathematics of advanced theoretical physics. 
A mastery of calculus and a willingness to build on this mathematical foundation 
are assumed. 

This text has been organized with two basic principles in view. First, it has been 
written in a form that it is hoped will encourage independent study. There are 
frequent cross references but no fixed, rigid page-by-page or chapter-by-chapter 
sequence is demanded. 

The reader will see that mathematics as a language is beautiful and elegant. 
Unfortunately, elegance all too often means elegance for the expert and obscurity 
for the beginner. While still attempting to point out the intrinsic beauty of mathe- 
matics, elegance has occasionally been reluctantly but deliberately sacrificed in 
the hope of achieving greater flexibility and greater clarity for the student. 

Mathematical rigor has been treated in a similar spirit. It is not stressed to the 
point of becoming a mental block to the use of mathematics. Limitations are 
explained, however, and warnings given against blind, uncomprehending appli- 
cation of mathematical relations. 

The second basic principle has been to emphasize and re-emphasize physical 
examples in the text and in the exercises to help motivate the student, to illustrate 
the relevance of mathematics to his science and engineering. 

This principle has also played a decisive role in the selection and development 
of material. The subject of differential equations, for example, is no longer a 
series of trick solutions of abstract, relatively meaningless puzzles but the solu- 
tions and general properties of the differential equations the student will most 
frequently encounter in a description of our real physical world. 

INTRODUCTION


Many of the physical examples used to illustrate the applications of mathemat- 
ics are taken from the fields of electromagnetic theory and quantum mechanics. 
For convenience the main equations are listed below and the symbols identified. 
References in these fields are also given. 


ELECTROMAGNETIC THEORY 


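MAXWELL'S EQUATIONS (MKS UNITS, VACUUM)

$\nabla \cdot \mathbf{D} = \rho$        $\nabla \times \mathbf{E} = -\dfrac{\partial \mathbf{B}}{\partial t}$

$\nabla \cdot \mathbf{B} = 0$        $\nabla \times \mathbf{H} = \dfrac{\partial \mathbf{D}}{\partial t} + \mathbf{J}$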

Here E is the electric field defined in terms of force on a static charge and B the 
magnetic induction defined in terms of force on a moving charge. The related 
fields D and H are given (in vacuum) by 

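$\mathbf{D} = \varepsilon_0 \mathbf{E}$ and $\mathbf{B} = \mu_0 \mathbf{H}$

The quantity $\rho$ represents free charge density while J is the corresponding
current. The electric field E and the magnetic induction B are often expressed in
terms of the scalar potential $\varphi$ and the magnetic vector potential A.

$\mathbf{E} = -\dfrac{\partial \mathbf{A}}{\partial t} - \nabla\varphi$        $\mathbf{B} = \nabla \times \mathbf{A}$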


For additional details see: J. M. Marion, Classical Electromagnetic Radiation , 
New York: Academic Press (1965); J. D. Jackson, Classical Electrodynamics , 2nd 
ed. New York: Wiley (1975). 

Note that Marion and Jackson prefer Gaussian units. A glance at the last two
texts and the great demands they make upon the student’s mathematical
competence should provide considerable motivation for the study of this book.


QUANTUM MECHANICS 

SCHRÖDINGER WAVE EQUATION (TIME INDEPENDENT)

$-\dfrac{\hbar^2}{2m}\nabla^2\psi + V\psi = E\psi$

$\psi$ is the (unknown) wave function. The potential energy, often a function of
position, is denoted by V while E is the total energy of the system. The mass of the
particle being described by $\psi$ is m. $\hbar$ is Planck’s constant h divided by $2\pi$. Among
the extremely large number of beginning or intermediate texts we might note: A.
Messiah, Quantum Mechanics (2 vols.), New York: Wiley (1961); R. H. Dicke and J.
P. Wittke, Introduction to Quantum Mechanics, Reading, Mass.: Addison-Wesley
(1960); E. Merzbacher, Quantum Mechanics, 2nd ed., New York: Wiley (1970).



1 VECTOR 
ANALYSIS 


1.1 DEFINITIONS, ELEMENTARY APPROACH 

In science and engineering we frequently encounter quantities that have 
magnitude and magnitude only: mass, time, and temperature. These we label 
scalar quantities. In contrast, many interesting physical quantities have mag- 
nitude and, in addition, an associated direction. This second group includes 
displacement, velocity, acceleration, force, momentum, and angular momen- 
tum. Quantities with magnitude and direction are labeled vector quantities. 
Usually, in elementary treatments, a vector is defined as a quantity having 
magnitude and direction. To distinguish vectors from scalars, we identify vector 
quantities with boldface type, that is, V. 

As an historical sidelight, it is interesting to note that the vector quantities 
listed are all taken from mechanics but that vector analysis was not used in the 
development of mechanics and, indeed, had not been created. The need for 
vector analysis became apparent only with the development of Maxwell’s 
electromagnetic theory and in appreciation of the inherent vector nature of 
quantities such as the electric field and magnetic field. 

Our vector may be conveniently represented by an arrow with length propor- 
tional to the magnitude. The direction of the arrow gives the direction of the 
vector, the positive sense of direction being indicated by the point. In this 
representation vector addition 

C = A + B (1.1) 

consists in placing the rear end of vector B at the point of vector A. Vector C 
is then represented by an arrow drawn from the rear of A to the point of B. 
This procedure, the triangle law of addition, assigns meaning to Eq. 1.1 and is 
illustrated in Fig. 1.1. By completing the parallelogram, we see that 

C = A + B = B + A, (1.2) 

as shown in Fig. 1.2. In words, vector addition is commutative.

FIG. 1.1 Triangle law of vector addition

FIG. 1.2 Parallelogram law of vector addition


For the sum of three vectors 

D = A + B + C,

Fig. 1.3, we may first add A and B 

A + B = E. 

Then this sum is added to C 


D = E + C. 

Similarly, we may first add B and C 

B + C = F. 

Then 

D = A + F.

In terms of the original expression, 

(A + B) + C = A + (B + C).
Vector addition is associative. 



A direct physical example of the parallelogram addition law is provided by 
a weight suspended by two cords. If the junction point (O in Fig. 1.4) is in
equilibrium, the vector sum of the two forces $\mathbf{F}_1$ and $\mathbf{F}_2$ must just cancel the
downward force of gravity, $\mathbf{F}_3$. Here the parallelogram addition law is subject
to immediate experimental verification.¹


1 Strictly speaking the parallelogram addition was introduced as a definition. 
Experiments show that if we assume that the forces are vector quantities and 
we combine them by parallelogram addition the equilibrium condition of 
zero resultant force is satisfied. 

FIG. 1.4 Equilibrium of forces. $\mathbf{F}_1 + \mathbf{F}_2 = -\mathbf{F}_3$


Subtraction may be handled by defining the negative of a vector as a vector 
of the same magnitude but with reversed direction. Then 

A - B = A + (-B).

In Fig. 1.3 

A = E - B. 

Note that the vectors are treated as geometrical objects that are independent 
of any coordinate system. Indeed, we have not yet introduced a coordinate 
system. This concept of independence of a preferred coordinate system is 
developed in considerable detail in the next section. 

The representation of vector A by an arrow suggests a second possibility. 
Arrow A (Fig. 1.5), starting from the origin,² terminates at the point $(x_1, y_1, z_1)$.
Thus, if we agree that the vector is to start at the origin, the positive end may
be specified by giving the cartesian coordinates $(x_1, y_1, z_1)$ of the arrow head.

Although A could have represented any vector quantity (momentum,
electric field, etc.), one particularly important vector quantity, the displacement
from the origin to the point $(x_1, y_1, z_1)$, is denoted by the special symbol r. We
then have a choice of referring to the displacement as either the vector r or the
collection $(x_1, y_1, z_1)$, the coordinates of its end point.

$\mathbf{r} \leftrightarrow (x_1, y_1, z_1)$. (1.3)


2 The reader will see that we could start from any point in our cartesian
reference frame; we choose the origin for simplicity.



Using r for the magnitude of vector r, we see from Fig. 1.6 that the endpoint
coordinates and the magnitude are related by

$x_1 = r\cos\alpha, \quad y_1 = r\cos\beta, \quad z_1 = r\cos\gamma.$ (1.4)

$\cos\alpha$, $\cos\beta$, and $\cos\gamma$ are called the direction cosines, $\alpha$ being the angle between
the given vector and the positive x-axis, and so on. One further bit of vocabulary:
The quantities $x_1$, $y_1$, and $z_1$ are known as the (cartesian) components of r or
the projections of r.
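
As a numerical illustration of Eq. 1.4, the following minimal sketch computes the magnitude and direction cosines of a displacement from its cartesian components. The function name `direction_cosines` and the sample point (1, 2, 2) are illustrative choices, not part of the text.

```python
import math

def direction_cosines(x1, y1, z1):
    """Return (r, cos_alpha, cos_beta, cos_gamma) for the point (x1, y1, z1)."""
    r = math.sqrt(x1 * x1 + y1 * y1 + z1 * z1)
    if r == 0.0:
        raise ValueError("the null vector has no direction")
    # Eq. 1.4 solved for the cosines: cos(alpha) = x1 / r, and so on.
    return r, x1 / r, y1 / r, z1 / r

r, ca, cb, cg = direction_cosines(1.0, 2.0, 2.0)
print(r)                           # magnitude of the displacement
print(ca, cb, cg)                  # the three direction cosines
print(ca ** 2 + cb ** 2 + cg ** 2)  # should be 1 to rounding error
```

The last printed value checks that the squares of the direction cosines sum to unity, which is the Pythagorean relation for the magnitude rewritten in terms of the cosines.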


FIG. 1.6 Direction cosines

If we proceed in the same manner, any vector A may be resolved into its
components (or projected onto the coordinate axes) to yield

$A_x = A\cos\alpha,$ (1.5)

in which $\alpha$ is the angle between A and the positive x-axis. Again, we may choose
to refer to the vector as a single quantity A or to its components $(A_x, A_y, A_z)$.
Note that the subscript x in $A_x$ denotes the x component and not a dependence
on the variable x; $A_x$ may be a function of x, y, and z, as in $A_x(x, y, z)$. The choice
between using A or its components $(A_x, A_y, A_z)$ is essentially a choice between
a geometric or an algebraic representation. In the language of group theory
(Chapter 4), the two representations are isomorphic.

Use either representation at your convenience. The geometric “arrow in
space” may aid in visualization. The algebraic set of components is usually
much more suitable for precise numerical or algebraic calculations. 

Vectors enter physics in two distinct forms. (1) Vector A may represent a 
single force acting at a single point. The force of gravity acting at the center of 
gravity illustrates this form. (2) Vector A may be defined over some extended 
region; that is, A and its components may be functions of position:
$A_x = A_x(x, y, z)$, and so on. Examples of this sort include the velocity of a fluid
varying from point to point over a given volume and electric and magnetic
fields. Some writers distinguish these two cases by referring to the vector
defined over a region as a vector field. The concept of the vector defined over a
region and being a function of position will be extremely important in Section
1.2 and in later sections where we differentiate and integrate vectors.

At this stage it is convenient to introduce unit vectors along each of the 
coordinate axes. Let i be a vector of unit magnitude pointing in the positive 
x-direction, j, a vector of unit magnitude in the positive y-direction, and k, a 
vector of unit magnitude in the positive z-direction. Then $\mathbf{i}A_x$ is a vector with
magnitude equal to $A_x$ and in the positive x-direction. By vector addition

$\mathbf{A} = \mathbf{i}A_x + \mathbf{j}A_y + \mathbf{k}A_z,$ (1.6)

which states that a vector equals the vector sum of its components. Note that if
A vanishes, all of its components must vanish individually; that is, if

$\mathbf{A} = 0$, then $A_x = A_y = A_z = 0$.

Finally, by the Pythagorean theorem, the magnitude of vector A is

$A = (A_x^2 + A_y^2 + A_z^2)^{1/2}.$ (1.7a)

This resolution of a vector into its components can be carried out in a variety 
of coordinate systems, as shown in Chapter 2. Here we restrict ourselves to 
cartesian coordinates. 

Equation 1.6 is actually an assertion that the three unit vectors i, j, and k 
span our real three-dimensional space: Any constant vector may be written as
a linear combination of i, j, and k. Since i, j, and k are linearly independent 
(no one is a linear combination of the other two), they form a basis for the real 
three-dimensional space. 

As a replacement of the graphical technique, addition and subtraction of
vectors may now be carried out in terms of their components. For
$\mathbf{A} = \mathbf{i}A_x + \mathbf{j}A_y + \mathbf{k}A_z$ and $\mathbf{B} = \mathbf{i}B_x + \mathbf{j}B_y + \mathbf{k}B_z$,

$\mathbf{A} \pm \mathbf{B} = \mathbf{i}(A_x \pm B_x) + \mathbf{j}(A_y \pm B_y) + \mathbf{k}(A_z \pm B_z).$ (1.7b)

EXAMPLE 1.1.1

Let

A = 6i + 4j + 3k
B = 2i - 3j - 3k.

Then by Eq. 1.7b

A + B = 8i + j

and

A - B = 4i + 7j + 6k.
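
The component arithmetic of Example 1.1.1 can be reproduced in a few lines. This is a minimal sketch that stores each vector as a tuple of its (x, y, z) components; the helper names `add` and `sub` are illustrative, not from the text.

```python
def add(a, b):
    """Componentwise sum A + B of two vectors given as tuples."""
    return tuple(ai + bi for ai, bi in zip(a, b))

def sub(a, b):
    """Componentwise difference A - B of two vectors given as tuples."""
    return tuple(ai - bi for ai, bi in zip(a, b))

A = (6, 4, 3)    # A = 6i + 4j + 3k
B = (2, -3, -3)  # B = 2i - 3j - 3k

print(add(A, B))  # (8, 1, 0), that is, 8i + j
print(sub(A, B))  # (4, 7, 6), that is, 4i + 7j + 6k
```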

It should be emphasized here that the unit vectors i, j, and k are used for 
convenience. They are not essential; we can describe vectors and use them 
entirely in terms of their components: $\mathbf{A} \leftrightarrow (A_x, A_y, A_z)$. This is the approach of
the two more powerful, more sophisticated definitions of vector discussed in
the next section. However, i, j, and k emphasize the direction, which will be
useful in Chapter 2.

So far we have defined the operations of addition and subtraction of vectors.
Three varieties of multiplication are defined on the basis of their applicability:
a scalar or inner product in Section 1.3, a vector product peculiar to three- 
dimensional space in Section 1.4, and a direct or outer product yielding a 
second-rank tensor in Section 3.2. Division by a vector is not defined. See 
Exercises 4.2.21 and 22. 

EXERCISES 

1.1.1 Show how to find A and B, given A + B and A — B. 

1.1.2 The vector A whose magnitude is 10 units makes equal angles with the coordinate
axes. Find $A_x$, $A_y$, and $A_z$.

1.1.3 Calculate the components of a unit vector that lies in the xy-plane and makes 
equal angles with the positive directions of the x- and y-axes. 

1.1.4 The velocity of sailboat A relative to sailboat B, $\mathbf{v}_{rel}$, is defined by the equation
$\mathbf{v}_{rel} = \mathbf{v}_A - \mathbf{v}_B$, where $\mathbf{v}_A$ is the velocity of A and $\mathbf{v}_B$ is the velocity of B. Determine
the velocity of A relative to B if

$\mathbf{v}_A$ = 30 km/hr east

$\mathbf{v}_B$ = 40 km/hr north

ANS. $v_{rel}$ = 50 km/hr, 53.1° south of east.

1.1.5 A sailboat sails for 1 hr at 4 km/hr (relative to the water) on a steady compass
heading of 40° east of north. The sailboat is simultaneously carried along by a
current. At the end of the hour the boat is 6.12 km from its starting point. The
line from its starting point to its location lies 60° east of north. Find the x (east-
erly) and y (northerly) components of the water’s velocity.

ANS. $v_{east}$ = 2.73 km/hr, $v_{north} \approx 0$ km/hr.

1.1.6 A vector equation can be reduced to the form A = B. From this show that the 
one vector equation is equivalent to three scalar equations. 

Assuming the validity of Newton’s second law F = ma as a vector equation, 
this means that a x depends only on F x and is independent of F y and F z . 

1.1.7 The vertices of a triangle A, B, and C are given by the points (-1, 0, 2), (0, 1, 0),
and (1, -1, 0), respectively. Find point D so that the figure ABDC forms a plane
parallelogram.

ANS. (2, 0, -2).

1.1.8 A triangle is defined by the vertices of three vectors, A, B, and C, that extend
from the origin. In terms of A, B, and C show that the vector sum of the successive
sides of the triangle (AB + BC + CA) is zero.

1.1.9 A sphere of radius a is centered at a point $\mathbf{r}_1$.

(a) Write out the algebraic equation for the sphere.

(b) Write out a vector equation for the sphere.

ANS. (a) $(x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2 = a^2$.
(b) $\mathbf{r} = \mathbf{r}_1 + \mathbf{a}$.

($\mathbf{a}$ takes on all directions but has a fixed magnitude, a.)

1.1.10 A corner reflector is formed by three mutually perpendicular reflecting surfaces.
Show that a ray of light incident upon the corner reflector (striking all three 
surfaces) is reflected back along a line parallel to the line of incidence. 

Hint. Consider the effect of a reflection on the components of a vector describing 
the direction of the light ray. 

1.1.11 Hubble’s law. Hubble found that distant galaxies are receding with a velocity
proportional to their distance from where we are on Earth. For the ith galaxy

$\mathbf{v}_i = H_0 \mathbf{r}_i$

with us at the origin. Show that this recession of the galaxies from us does not
imply that we are at the center of the universe. Specifically, take the galaxy
at $\mathbf{r}_1$ as a new origin and show that Hubble’s law is still obeyed.


1.2 ADVANCED DEFINITIONS* 

In the preceding section vectors were defined or represented in two equiv- 
alent ways: (1) geometrically by specifying magnitude and direction, as with an 
arrow, and (2) algebraically by specifying the components relative to cartesian 
coordinate axes. The second definition is adequate for the vector analysis of 
this chapter. In this section two more refined, sophisticated, and powerful 


*This section is optional. It is not essential for the remaining sections of this 
chapter. 

definitions are presented. First, the vector field is defined in terms of the
behavior of its components under rotation of the coordinate axes. This trans-
formation theory approach leads into the tensor analysis of Chapter 3. Second,
the component definition of Section 1.1 is refined and generalized according to
the mathematician’s concepts of vector and vector space. This approach leads
to function spaces including the Hilbert space (Section 9.4).

ROTATION OF THE COORDINATE AXES 

The definition of vector as a quantity with magnitude and direction breaks 
down in advanced work. On the one hand, we encounter quantities, such as 
elastic constants and index of refraction in anisotropic crystals, that have 
magnitude and direction but which are not vectors. On the other hand, our 
naive approach is awkward to generalize, to extend to more complex quantities. 
We seek a new definition of vector field, using our displacement vector r as a 
prototype. 

There is an important physical basis for our development of a new definition. 
We describe our physical world by mathematics, but it and any physical 
predictions we may make must be independent of our mathematical analysis. 
Some writers compare the physical system to a building and the mathematical 
analysis to the scaffolding used to construct the building. In the end the scaffold- 
ing is stripped off and the building stands. 

In our specific case we assume that space is isotropic; that is, there is no 
preferred direction or all directions are equivalent. Then the physical system 
being analyzed or the physical law being enunciated cannot and must not 
depend on our choice or orientation of the coordinate axes. 

Now we return to the concept of vector r as a geometric object independent 
of the coordinate system. Let us look at r in two different systems, one rotated 
in relation to the other. 

For simplicity we consider first the two-dimensional case. If the x-, y-
coordinates are rotated counterclockwise through an angle $\varphi$, keeping r fixed
(Fig. 1.7), we get the following relations between the components resolved in
the original system (unprimed) and those resolved in the new rotated system
(primed):

$x' = x\cos\varphi + y\sin\varphi,$
$y' = -x\sin\varphi + y\cos\varphi.$ (1.8)


We saw in Section 1.1 that a vector could be represented by the coordinates
of a point; that is, the coordinates were proportional to the vector components.
Hence the components of a vector must transform under rotation as coordinates
of a point (such as r). Therefore whenever any pair of quantities $A_x(x, y)$ and
$A_y(x, y)$ in the xy-coordinate system is transformed into $(A'_x, A'_y)$ by this rotation
of the coordinate system with

$A'_x = A_x\cos\varphi + A_y\sin\varphi,$
$A'_y = -A_x\sin\varphi + A_y\cos\varphi,$ (1.9)

we define¹ $A_x$ and $A_y$ as the components of a vector A. Our vector now is
defined in terms of the transformation of its components under rotation of the
coordinate system. If $A_x$ and $A_y$ transform in the same way as x and y, the
components of the two-dimensional displacement vector, they are the compo-
nents of a vector A. If $A_x$ and $A_y$ do not show this form invariance when the
coordinates are rotated, they do not form a vector.

FIG. 1.7 Rotation of cartesian coordinate axes about the z-axis

The vector field components $A_x$ and $A_y$ satisfying the defining equations,
Eq. 1.9, associate a magnitude A and a direction with each point in space. The
magnitude is a scalar quantity, invariant to the rotation of the coordinate
system. The direction (relative to the unprimed system) is likewise invariant to
the rotation of the coordinate system (see Exercise 1.2.1). The result of all this
is that the components of a vector may vary according to the rotation of the
primed coordinate system. This is what Eq. 1.9 says. But the variation with the
angle is just such that the components in the rotated coordinate system, $A'_x$ and
$A'_y$, define a vector with the same magnitude and the same direction as the
vector defined by the components $A_x$ and $A_y$ relative to the x-, y-coordinate
axes. (Compare Exercise 1.2.1.) The components of A in a particular coordinate
system constitute the representation of A in that coordinate system. Equation
1.9, the transformation relation, is a guarantee that the entity A is independent
of the rotation of the coordinate system.
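
The invariance claimed here can be checked numerically. The sketch below applies the two-dimensional transformation of Eq. 1.9 to a sample vector and compares the magnitude and direction before and after; the function name `rotate_axes`, the sample components (3, 4), and the 30° angle are illustrative assumptions.

```python
import math

def rotate_axes(ax, ay, phi):
    """Components of A relative to axes rotated counterclockwise by phi (Eq. 1.9)."""
    ax_p = ax * math.cos(phi) + ay * math.sin(phi)
    ay_p = -ax * math.sin(phi) + ay * math.cos(phi)
    return ax_p, ay_p

ax, ay = 3.0, 4.0
phi = math.radians(30.0)
ax_p, ay_p = rotate_axes(ax, ay, phi)

# The magnitude is the same in both coordinate systems.
mag = math.hypot(ax, ay)
mag_p = math.hypot(ax_p, ay_p)
print(mag, mag_p)

# The angle measured from the x'-axis is the angle from the x-axis minus phi,
# so the arrow itself has not moved; only its description has changed.
alpha = math.atan2(ay, ax)
alpha_p = math.atan2(ay_p, ax_p)
print(math.degrees(alpha), math.degrees(alpha_p))
```

The components change with the rotation angle, but the magnitude and the direction in space that they describe do not.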

To go on to three and, later, four dimensions, we find it convenient to use a 
more compact notation. Let 


1 The corresponding definition of a scalar quantity is S' = S, that is, invariant
under rotation of the coordinates.

$x \to x_1$
$y \to x_2$ (1.10)

$a_{11} = \cos\varphi, \quad a_{12} = \sin\varphi,$
$a_{21} = -\sin\varphi, \quad a_{22} = \cos\varphi.$ (1.11)

Then Eq. 1.8 becomes

$x'_1 = a_{11}x_1 + a_{12}x_2,$
$x'_2 = a_{21}x_1 + a_{22}x_2.$ (1.12)


The coefficient $a_{ij}$ may be interpreted as a direction cosine, the cosine of the
angle between $x'_i$ and $x_j$; that is,

$a_{12} = \cos(x'_1, x_2) = \sin\varphi,$
$a_{21} = \cos(x'_2, x_1) = \cos\left(\varphi + \dfrac{\pi}{2}\right) = -\sin\varphi.$ (1.13)

The advantage of the new notation² is that it permits us to use the summation
symbol $\sum$ and to rewrite Eqs. 1.12 as

$x'_i = \sum_{j=1}^{2} a_{ij}x_j, \quad i = 1, 2.$ (1.14)

Note that i remains as a parameter that gives rise to one equation when it is 
set equal to 1 and to a second equation when it is set equal to 2. The index j, 
of course, is a summation index, a dummy index, and as with a variable of 
integration, j may be replaced by any other convenient symbol. 

The generalization to three, four, or N dimensions is now very simple. The
set of N quantities, $V_j$, is said to be the components of an N-dimensional vector,
V, if and only if their values relative to the rotated coordinate axes are given by

$V'_i = \sum_{j=1}^{N} a_{ij}V_j, \quad i = 1, 2, \ldots, N.$ (1.15)

As before, $a_{ij}$ is the cosine of the angle between $x'_i$ and $x_j$. Often the upper limit
N and the corresponding range of i will not be indicated. It is taken for granted
that the reader knows how many dimensions his or her space has.
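
The summation form of Eq. 1.15 translates directly into a double loop: the free index i labels the output component and the dummy index j is summed. The following is a minimal sketch for any N, assuming the direction cosines are supplied as a nested list `a[i][j]`; the function name `transform` and the 90° test case are illustrative.

```python
import math

def transform(a, v):
    """Return the list of V'_i = sum over j of a[i][j] * v[j] (Eq. 1.15)."""
    n = len(v)
    return [sum(a[i][j] * v[j] for j in range(n)) for i in range(n)]

# Two-dimensional direction cosines of Eq. 1.11 for a 90 degree rotation of the axes.
phi = math.radians(90.0)
a = [[math.cos(phi), math.sin(phi)],
     [-math.sin(phi), math.cos(phi)]]

# A vector along the old x-axis lies along the negative y'-axis of the new system.
v_prime = transform(a, [1.0, 0.0])
print(v_prime)
```

Renaming the inner loop variable `j` to anything else changes nothing, which is the sense in which j is a dummy index.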

From the definition of $a_{ij}$ as the cosine of the angle between the positive $x'_i$


2 The reader may wonder at the replacement of one parameter $\varphi$ by four
parameters $a_{ij}$. Clearly, the $a_{ij}$ do not constitute a minimum set of parameters.
For two dimensions the four $a_{ij}$ are subject to the three constraints given in
Eq. 1.18. The justification for the redundant set of direction cosines is the
convenience it provides. Hopefully, this convenience will become more
apparent in Chapters 3 and 4. For three-dimensional rotations (9 $a_{ij}$ but only
three independent) alternate descriptions are provided by: (1) the Euler angles
discussed in Section 4.3, (2) quaternions, and (3) the Cayley-Klein parameters.
These alternatives have their respective advantages and disadvantages.

direction and the positive $x_j$ direction we may write (cartesian coordinates)³

$a_{ij} = \dfrac{\partial x'_i}{\partial x_j} = \dfrac{\partial x_j}{\partial x'_i}.$ (1.16)

Note carefully that these are partial derivatives. By use of Eq. 1.16, Eq. 1.15
becomes

$V'_i = \sum_{j=1}^{N} \dfrac{\partial x'_i}{\partial x_j} V_j = \sum_{j=1}^{N} \dfrac{\partial x_j}{\partial x'_i} V_j.$ (1.17)

The direction cosines $a_{ij}$ satisfy an orthogonality condition

$\sum_i a_{ij}a_{ik} = \delta_{jk}$ (1.18)

or, equivalently,

$\sum_i a_{ji}a_{ki} = \delta_{jk}.$ (1.19)

The symbol $\delta_{jk}$ is the Kronecker delta defined by

$\delta_{jk} = 1$ for $j = k$,
$\delta_{jk} = 0$ for $j \neq k$. (1.20)

The reader may easily verify that Eqs. 1.18 and 1.19 hold in the two-dimensional
case by substituting in the specific $a_{ij}$ from Eq. 1.11. The result is the well-known
identity $\sin^2\varphi + \cos^2\varphi = 1$ for the nonvanishing case. To verify Eq. 1.18 in
general form, we may use the partial derivative forms of Eqs. 1.16 to obtain

$\sum_i \dfrac{\partial x_j}{\partial x'_i}\dfrac{\partial x_k}{\partial x'_i} = \sum_i \dfrac{\partial x_j}{\partial x'_i}\dfrac{\partial x'_i}{\partial x_k} = \dfrac{\partial x_j}{\partial x_k}.$ (1.21)

The last step follows by the standard rules for partial differentiation, assuming
that $x_j$ is a function of $x'_1$, $x'_2$, $x'_3$, and so on. The final result, $\partial x_j/\partial x_k$, is equal
to $\delta_{jk}$, since $x_j$ and $x_k$ as coordinate lines ($j \neq k$) are assumed to be perpendicular
(two or three dimensions) or orthogonal (for any number of dimensions).
Equivalently, we may assume that $x_j$ and $x_k$ ($j \neq k$) are totally independent
variables. If $j = k$, the partial derivative is clearly equal to 1.
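
The orthogonality condition can also be checked numerically for a three-dimensional rotation. The sketch below builds a sample set of nine direction cosines by composing a rotation of the axes about z with one about x, then measures how far the sum over i of a_ij a_ik departs from the Kronecker delta. The helper names, the choice of axes, and the angles 0.3 and 0.7 rad are illustrative assumptions, not taken from the text.

```python
import math

def rot_z(phi):
    """Direction cosines for axes rotated counterclockwise by phi about z."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(theta):
    """Direction cosines for axes rotated counterclockwise by theta about x."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def matmul(p, q):
    """Product of two square arrays, used here to compose successive rotations."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def orthogonality_defect(a):
    """Largest |sum_i a[i][j] * a[i][k] - delta_jk| over all j and k (Eq. 1.18)."""
    n = len(a)
    worst = 0.0
    for j in range(n):
        for k in range(n):
            total = sum(a[i][j] * a[i][k] for i in range(n))
            delta = 1.0 if j == k else 0.0
            worst = max(worst, abs(total - delta))
    return worst

a = matmul(rot_x(0.7), rot_z(0.3))
print(orthogonality_defect(a))  # of the order of floating-point rounding error
```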

In redefining a vector in terms of how its components transform under a 
rotation of the coordinate system, we should emphasize two points : 

1. This definition is developed because it is useful and appropriate in 
describing our physical world. Our vector equations will be independent of 
any particular coordinate system. (The coordinate system need not even be 
cartesian.) The vector equation can always be expressed in some particular 
coordinate system and, to obtain numerical results, we must ultimately express 
the equation in some specific coordinate system. 


3 Differentiate $x'_i = \sum_k a_{ik}x_k$ with respect to $x_j$. See the discussion following
Eq. 1.21. Section 4.3 provides an alternate approach.

2. This definition is subject to a generalization that will open up the branch
of mathematics known as tensor analysis (Chapter 3).

A qualification is also in order. The behavior of the vector components 
under rotation of the coordinates is used in Section 1.3 to prove that a scalar 
product is a scalar, in Section 1.4 to prove that a vector product is a vector, 
and in Section 1.6 to show that the gradient of a scalar, $\nabla\varphi$, is a vector. The
remainder of this chapter proceeds on the basis of the less restrictive definitions 
of the vector given in Section 1.1. 

Vectors and Vector Space 

It is customary in mathematics to label an ordered triple of real numbers
$(x_1, x_2, x_3)$ a vector x. The number $x_n$ is called the nth component of vector x.
The collection of all such vectors (obeying the properties that follow) forms a
three-dimensional real vector space. We ascribe five properties to our vectors:
If $\mathbf{x} = (x_1, x_2, x_3)$ and $\mathbf{y} = (y_1, y_2, y_3)$,

1. Vector equality: $\mathbf{x} = \mathbf{y}$ means $x_i = y_i$, i = 1, 2, 3.

2. Vector addition: $\mathbf{x} + \mathbf{y} = \mathbf{z}$ means $x_i + y_i = z_i$, i = 1, 2, 3.

3. Scalar multiplication: $a\mathbf{x} \leftrightarrow (ax_1, ax_2, ax_3)$ (with a real).

4. Negative of a vector: $-\mathbf{x} = (-1)\mathbf{x} \leftrightarrow (-x_1, -x_2, -x_3)$.

5. Null vector: There exists a null vector $\mathbf{0} \leftrightarrow (0, 0, 0)$.

Since our vector components are simply numbers, the following properties 
also hold : 

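1. Addition of vectors is commutative: $\mathbf{x} + \mathbf{y} = \mathbf{y} + \mathbf{x}$.

2. Addition of vectors is associative: $(\mathbf{x} + \mathbf{y}) + \mathbf{z} = \mathbf{x} + (\mathbf{y} + \mathbf{z})$.

3. Scalar multiplication is distributive:
$a(\mathbf{x} + \mathbf{y}) = a\mathbf{x} + a\mathbf{y}$, also $(a + b)\mathbf{x} = a\mathbf{x} + b\mathbf{x}$.

4. Scalar multiplication is associative: $(ab)\mathbf{x} = a(b\mathbf{x})$.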

Further, the null vector 0 is unique as is the negative of a given vector x. 
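
Because the components are simply numbers, each of these properties can be spot-checked on sample triples. The sketch below does so with integer components so that the comparisons are exact; the helper names `vadd` and `smul` and the sample values are illustrative, and a check on a few triples is of course not a proof.

```python
def vadd(x, y):
    """Componentwise sum of two vectors given as tuples."""
    return tuple(xi + yi for xi, yi in zip(x, y))

def smul(a, x):
    """Multiply every component of x by the scalar a."""
    return tuple(a * xi for xi in x)

x, y, z = (1, 2, 3), (4, -5, 6), (-7, 8, 9)
a, b = 2, -3
null = (0, 0, 0)

checks = {
    "commutative": vadd(x, y) == vadd(y, x),
    "associative": vadd(vadd(x, y), z) == vadd(x, vadd(y, z)),
    "distributive over vectors": smul(a, vadd(x, y)) == vadd(smul(a, x), smul(a, y)),
    "distributive over scalars": smul(a + b, x) == vadd(smul(a, x), smul(b, x)),
    "scalar associative": smul(a * b, x) == smul(a, smul(b, x)),
    "negative": vadd(x, smul(-1, x)) == null,
}
for name, ok in checks.items():
    print(name, ok)
```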

So far as the vectors themselves are concerned this approach merely for- 
malizes the component discussion of Section 1.1. The importance lies in the 
extensions which will be considered in later chapters. In Chapter 4, we show that 
vectors form both an Abelian group under addition and a linear space with 
the transformations in the linear space described by matrices. Finally, and 
perhaps most important, for advanced physics the concept of vectors presented 
here may be generalized to (1) complex quantities, 4 (2) functions, and (3) an 
infinite number of components. This leads to infinite dimensional function 


4 The n-dimensional vector space of real n-tuples is often labeled $\mathbf{R}^n$ and the
n-dimensional vector space of complex n-tuples is labeled $\mathbf{C}^n$.





spaces, the Hilbert spaces, which are important in modern quantum theory. A
brief introduction to function expansions and Hilbert space appears in Section 
9.4. 


EXERCISES 

1.2.1 (a) Show that the magnitude of a vector A, $A = (A_x^2 + A_y^2)^{1/2}$, is independent
of the orientation of the rotated coordinate system,

$(A_x^2 + A_y^2)^{1/2} = (A'^2_x + A'^2_y)^{1/2}$

independent of the rotation angle $\varphi$.

This independence of angle is expressed by saying that A is invariant under
rotations.

(b) At a given point (x, y) A defines an angle $\alpha$ relative to the positive x-axis
and $\alpha'$ relative to the positive x'-axis. The angle from x to x' is $\varphi$. Show that
A = A' defines the same direction in space when expressed in terms of its
primed components, as in terms of its unprimed components; that is,

$\alpha' = \alpha - \varphi$.

1.2.2 Prove the orthogonality condition $\sum_i a_{ji}a_{ki} = \delta_{jk}$. As a special case of this the
direction cosines of Section 1.1 satisfy the relation

$\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$,

a result that also follows from Eq. 1.7a.


1.3 SCALAR OR DOT PRODUCT 

Having defined vectors, we now proceed to combine them. The laws for 
combining vectors must be mathematically consistent. From the possibilities 
that are consistent we select two that are both mathematically and physically 
interesting. A third possibility is introduced in Chapter 3, in which we form 
tensors. 

The combination $AB\cos\theta$, in which A and B are the magnitudes of two
vectors and $\theta$ is the angle between them, occurs frequently in physics (Fig. 1.8).

FIG. 1.8 Scalar product $\mathbf{A}\cdot\mathbf{B} = AB\cos\theta$

For instance,

$$\text{work} = \text{force} \times \text{displacement} \times \cos\theta$$

is usually interpreted as displacement times the projection of the force along 
the displacement. 

With such applications in mind, we define 

$$\mathbf{A}\cdot\mathbf{B} = A_x B_x + A_y B_y + A_z B_z = \sum_i A_i B_i \tag{1.22}$$

as the scalar, dot, or inner product of A and B. The scalar product of two
vectors is a scalar quantity. We note that from this definition $\mathbf{A}\cdot\mathbf{B} = \mathbf{B}\cdot\mathbf{A}$;
the scalar product is commutative. The unit vectors i, j, and k satisfy the relations

$$\mathbf{i}\cdot\mathbf{i} = \mathbf{j}\cdot\mathbf{j} = \mathbf{k}\cdot\mathbf{k} = 1, \tag{1.22a}$$

whereas

$$\mathbf{i}\cdot\mathbf{j} = \mathbf{i}\cdot\mathbf{k} = \mathbf{j}\cdot\mathbf{k} = 0, \qquad \mathbf{j}\cdot\mathbf{i} = \mathbf{k}\cdot\mathbf{i} = \mathbf{k}\cdot\mathbf{j} = 0. \tag{1.22b}$$

If we reorient our axes and let A define a new x-axis,$^1$ then

$$A_x = A, \qquad A_y = 0, \qquad A_z = 0$$

and

$$B_x = B\cos\theta.$$

Then by Eq. 1.22

$$\mathbf{A}\cdot\mathbf{B} = AB\cos\theta, \tag{1.23}$$

which may be taken as a second definition of scalar product. The component
definition, Eq. 1.22, might be labeled an algebraic definition. Then Eq. 1.23
would be a geometric definition. One of the most common applications of the
scalar product in physics is in the calculation of work, $W = \mathbf{F}\cdot\mathbf{s}$, the scalar
product of force and displacement.


EXAMPLE 1.3.1 


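For the two vectors A and B of Example 1.1.1, $\mathbf{A} = 6\mathbf{i} + 4\mathbf{j} + 3\mathbf{k}$, $\mathbf{B} = 2\mathbf{i} - 3\mathbf{j} - 3\mathbf{k}$,

$$\mathbf{A}\cdot\mathbf{B} = (12 - 12 - 9) = -9$$

by Eq. 1.22. In this case the projection of A on B (or B on A) is negative. Actually,

$$|\mathbf{A}| = (36 + 16 + 9)^{1/2} = (61)^{1/2} = 7.81,$$

$$|\mathbf{B}| = (4 + 9 + 9)^{1/2} = (22)^{1/2} = 4.69,$$

and $\cos\theta = -0.246$, $\theta = 104.2^\circ$.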
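
The arithmetic of Example 1.3.1 can be reproduced with a short Python sketch. It uses the algebraic definition (Eq. 1.22) for the dot product and then solves the geometric definition (Eq. 1.23) for the angle. The function names `dot`, `magnitude`, and `angle_between` are illustrative choices, not notation from the text.

```python
import math

def dot(a, b):
    """Algebraic definition, Eq. 1.22: sum of products of components."""
    return sum(ai * bi for ai, bi in zip(a, b))

def magnitude(a):
    """Length of a vector, |A| = (A . A)^(1/2)."""
    return math.sqrt(dot(a, a))

def angle_between(a, b):
    """Angle in degrees, obtained by solving Eq. 1.23 for theta."""
    cos_theta = dot(a, b) / (magnitude(a) * magnitude(b))
    return math.degrees(math.acos(cos_theta))

A = (6, 4, 3)
B = (2, -3, -3)

print(dot(A, B))                      # -9
print(round(magnitude(A), 2))         # 7.81
print(round(magnitude(B), 2))         # 4.69
print(round(angle_between(A, B), 1))  # 104.2
```

The negative dot product shows up directly as an angle larger than 90°.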


1 The invariance of $\mathbf{A}\cdot\mathbf{B}$ under rotation of the coordinate axes is proved later
in this section.

If $\mathbf{A}\cdot\mathbf{B} = 0$ and we know that $\mathbf{A} \neq 0$ and $\mathbf{B} \neq 0$, then from Eq. 1.23 $\cos\theta = 0$
or $\theta = 90^\circ$, $270^\circ$, and so on. The vectors A and B must be perpendicular.
Alternately, we may say A and B are orthogonal. The unit vectors i, j, and k
are mutually orthogonal. To develop this notion of orthogonality one more
step, suppose that n is a unit vector and r is a nonzero vector in the xy-plane;
that is, $\mathbf{r} = \mathbf{i}x + \mathbf{j}y$ (Fig. 1.9). If

$$\mathbf{n}\cdot\mathbf{r} = 0$$

for all choices of r, then n must be perpendicular (orthogonal) to the xy-plane.





Often it is convenient to replace i, j, and k by subscripted unit vectors $\mathbf{e}_m$,
m = 1, 2, 3, with $\mathbf{i} = \mathbf{e}_1$, and so on. Then Eqs. 1.22a and b become

$$\mathbf{e}_m\cdot\mathbf{e}_n = \delta_{mn}. \tag{1.22c}$$

For $m \neq n$ the unit vectors $\mathbf{e}_m$ and $\mathbf{e}_n$ are orthogonal. For m = n each vector is
normalized to unity, that is, has unit magnitude. The set $\mathbf{e}_m$ is said to be orthonormal.
A major advantage of Eq. 1.22c over Eqs. 1.22a and b is that Eq. 1.22c
may readily be generalized to N-dimensional space: m, n = 1, 2, . . . , N.
Finally, we are picking sets of unit vectors $\mathbf{e}_m$ that are orthonormal for
convenience, a very great convenience. The nonorthogonal situation is explored
in Section 4.4, “Oblique Coordinates.”

SCALAR PROPERTY 

We have not yet shown that the word scalar is justified or that the scalar 
product is indeed a scalar quantity. To do this, we investigate the behavior of 
$\mathbf{A}\cdot\mathbf{B}$ under a rotation of the coordinate system. By use of Eq. 1.15

$$A'_x B'_x + A'_y B'_y + A'_z B'_z = \sum_i a_{xi} A_i \sum_j a_{xj} B_j + \sum_i a_{yi} A_i \sum_j a_{yj} B_j + \sum_i a_{zi} A_i \sum_j a_{zj} B_j. \tag{1.24}$$

Using the indices k and l to sum over x, y, and z, we obtain

$$\sum_k A'_k B'_k = \sum_l \sum_i \sum_j a_{li} A_i a_{lj} B_j, \tag{1.25}$$

and, by rearranging the terms on the right-hand side, we have

$$\begin{aligned}
\sum_k A'_k B'_k &= \sum_i \sum_j \sum_l (a_{li} a_{lj}) A_i B_j \\
&= \sum_i \sum_j \delta_{ij} A_i B_j \\
&= \sum_i A_i B_i.
\end{aligned} \tag{1.26}$$

The last two steps follow by using Eq. 1.18, the orthogonality condition of the 
direction cosines, and Eq. 1.20, which defines the Kronecker delta. The effect 
of the Kronecker delta is to cancel all terms in a summation over either index 
except the term for which the indices are equal. In Eq. 1.26 its effect is to set 
j = i and to eliminate the summation over j. Of course, we could equally well 
set i = j and eliminate the summation over i. Equation 1.26 gives us 

$$\sum_k A'_k B'_k = \sum_i A_i B_i, \tag{1.27}$$

which is just our definition of a scalar quantity, one that remains invariant under 
the rotation of the coordinate system. 
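
The invariance expressed by Eq. 1.27 can be checked numerically. The sketch below assumes a rotation of the axes about the z-axis through an arbitrary angle (37° is an arbitrary choice), builds the direction cosines $a_{ij}$, applies the transformation law $A'_i = \sum_j a_{ij} A_j$ to both vectors, and compares the two dot products. The helper names are illustrative.

```python
import math

def rotation_about_z(phi):
    """Direction cosines a_ij for axes rotated by phi about the z-axis."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]

def transform(a, v):
    """Vector transformation law: v'_i = sum_j a_ij v_j."""
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

A = [6.0, 4.0, 3.0]
B = [2.0, -3.0, -3.0]
a = rotation_about_z(math.radians(37.0))

# Orthogonality of the direction cosines: sum_k a_ki a_kj = delta_ij.
gram = [[sum(a[k][i] * a[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)]

A_rot, B_rot = transform(a, A), transform(a, B)
print(dot(A, B), dot(A_rot, B_rot))  # the two values agree to rounding error
```

The components of A and B change under the rotation, while the sum of products does not.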

In a similar approach which exploits this concept of invariance, we take 
C = A + B and dot it into itself. 


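$$\begin{aligned}
\mathbf{C}\cdot\mathbf{C} &= (\mathbf{A} + \mathbf{B})\cdot(\mathbf{A} + \mathbf{B}) \\
&= \mathbf{A}\cdot\mathbf{A} + \mathbf{B}\cdot\mathbf{B} + 2\mathbf{A}\cdot\mathbf{B}.
\end{aligned} \tag{1.28}$$

Since

$$\mathbf{C}\cdot\mathbf{C} = C^2, \tag{1.29}$$

the square of the magnitude of vector C and thus an invariant quantity, we
see that

$$\mathbf{A}\cdot\mathbf{B} = \tfrac{1}{2}(C^2 - A^2 - B^2), \quad \text{invariant}. \tag{1.30}$$

Since the right-hand side of Eq. 1.30 is invariant (that is, a scalar quantity),
the left-hand side, $\mathbf{A}\cdot\mathbf{B}$, must also be invariant under rotation of the coordinate
system. Hence $\mathbf{A}\cdot\mathbf{B}$ is a scalar.

Equation 1.28 is really another form of the law of cosines which is

$$C^2 = A^2 + B^2 + 2AB\cos\theta. \tag{1.31}$$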

Comparing Eqs. 1.28 and 1.31, we have another verification of Eq. 1.23, or, 
if preferred, a vector derivation of the law of cosines (Fig. 1.10).

FIG. 1.10 The law of cosines

An interesting illustration of the geometric interpretation of the scalar 
product is provided by an example from a branch of general relativity. Consider 
a four-dimensional sphere 

$$x^2 + y^2 + z^2 + w^2 = 1$$

in x, y, z, w space. The surface of this four-dimensional sphere may be described
by the vector $\mathbf{r} = (x, y, z, w)$ with the restriction that $|\mathbf{r}| = 1$. It is possible to
construct a unit vector t that is tangential to this four-dimensional sphere over
its entire surface. As one possible example,

$$\mathbf{t} = (y, -x, w, -z).$$

The reader may verify that

$$\mathbf{t}\cdot\mathbf{t} = 1,$$

therefore unit magnitude, and

$$\mathbf{t}\cdot\mathbf{r} = 0,$$

therefore tangential, over the entire sphere. 
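
The two dot products above are easy to confirm by hand. As a numerical illustration, the following sketch picks a pseudo-random point on the unit sphere in four dimensions (the seed and the sampling method are arbitrary assumptions) and evaluates $\mathbf{t}\cdot\mathbf{t}$ and $\mathbf{t}\cdot\mathbf{r}$ with the n-dimensional form of Eq. 1.22.

```python
import math
import random

def dot(u, v):
    """Eq. 1.22 with the sum running over all n components."""
    return sum(ui * vi for ui, vi in zip(u, v))

rng = random.Random(0)

# A random direction, normalized so that |r| = 1.
raw = [rng.gauss(0.0, 1.0) for _ in range(4)]
norm = math.sqrt(dot(raw, raw))
x, y, z, w = (c / norm for c in raw)

r = (x, y, z, w)
t = (y, -x, w, -z)   # the tangent vector given in the text

print(dot(t, t))  # 1 to rounding error: unit magnitude
print(dot(t, r))  # 0 to rounding error: tangential
```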

The two-dimensional analog exists but there is no three-dimensional analog. 
Hair growing out of a sphere cannot be combed down all over. There will be 
a cowlick. 

The dot product, given by Eq. 1.22, may be generalized in two ways. The 
space need not be restricted to three dimensions. In n-dimensional space,
Eq. 1.22 applies with the sum running from 1 to n. Moreover, n may be infinity, with the
sum then a convergent infinite series (Section 5.2). The other generalization
extends the concept of vector to embrace functions. The function analog of a 
dot or inner product appears in Section 9.4. 



EXERCISES 


1.3.1 What is the cosine of the angle between the vectors

$$\mathbf{A} = 3\mathbf{i} + 4\mathbf{j} + \mathbf{k}$$

and

$$\mathbf{B} = \mathbf{i} - \mathbf{j} + \mathbf{k}?$$

ANS. $\cos\theta = 0$.

1.3.2 Two unit magnitude vectors $\mathbf{e}_i$ and $\mathbf{e}_j$ are required to be either parallel or
perpendicular to each other. Show that $\mathbf{e}_i\cdot\mathbf{e}_j$ provides an interpretation of Eq. 1.18,
the direction cosine orthogonality relation.

1.3.3 Given that (1) the dot product of a unit vector with itself is unity and (2) this
relation is valid in all (rotated) coordinate systems, show that $\mathbf{i}'\cdot\mathbf{i}' = 1$ (with the
primed system rotated 45° about the z-axis relative to the unprimed) implies that

$$\mathbf{i}\cdot\mathbf{j} = 0.$$

1.3.4 The vector r, starting at the origin, terminates at and specifies the point in space
(x, y, z). Find the surface swept out by the tip of r if

(a) $(\mathbf{r} - \mathbf{a})\cdot\mathbf{a} = 0$,

(b) $(\mathbf{r} - \mathbf{a})\cdot\mathbf{r} = 0$.

The vector a is a constant (constant in magnitude and direction). 



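1.3.5 The interaction energy between two dipoles of moments $\boldsymbol{\mu}_1$ and $\boldsymbol{\mu}_2$ may be written
in the vector form

$$V = -\frac{\boldsymbol{\mu}_1\cdot\boldsymbol{\mu}_2}{r^3} + \frac{3(\boldsymbol{\mu}_1\cdot\mathbf{r})(\boldsymbol{\mu}_2\cdot\mathbf{r})}{r^5}$$

and in the scalar form

$$V = \frac{\mu_1 \mu_2}{r^3}\,(2\cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2\cos\varphi).$$

Here $\theta_1$ and $\theta_2$ are the angles of $\boldsymbol{\mu}_1$ and $\boldsymbol{\mu}_2$ relative to r, while $\varphi$ is the azimuth of
$\boldsymbol{\mu}_2$ relative to the $\boldsymbol{\mu}_1$-r plane. Show that these two forms are equivalent.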

Hint. Eq. 12.198 will be helpful. 

1.3.6 A pipe comes diagonally down the south wall of a building, making an angle 
of 45° with the horizontal. Coming into a corner, the pipe turns and continues 
diagonally down a west-facing wall, still making an angle of 45° with the horizontal. 
What is the angle between the south-wall and west-wall sections of the pipe?

ANS. 120°. 


1.4 VECTOR OR CROSS PRODUCT 

A second form of vector multiplication employs the sine of the included 
angle instead of the cosine. For instance, the angular momentum of a body is 
defined as 

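$$\begin{aligned}
\text{angular momentum} &= \text{radius arm} \times \text{linear momentum} \\
&= \text{distance} \times \text{linear momentum} \times \sin\theta.
\end{aligned}$$

For convenience in treating problems relating to quantities such as angular
momentum, torque, and angular velocity, we define the vector or cross product
as

$$\mathbf{C} = \mathbf{A}\times\mathbf{B},$$

with

$$C = AB\sin\theta. \tag{1.32}$$

Unlike the preceding case of the scalar product, C is now a vector, and we assign
it a direction perpendicular to the plane of A and B such that A, B, and C form a
right-handed system. With this choice of direction we have

$$\mathbf{A}\times\mathbf{B} = -\mathbf{B}\times\mathbf{A}, \quad \text{anticommutation}. \tag{1.32a}$$

From this definition of cross product we have

$$\mathbf{i}\times\mathbf{i} = \mathbf{j}\times\mathbf{j} = \mathbf{k}\times\mathbf{k} = 0, \tag{1.32b}$$

whereas

$$\begin{aligned}
\mathbf{i}\times\mathbf{j} &= \mathbf{k}, & \mathbf{j}\times\mathbf{k} &= \mathbf{i}, & \mathbf{k}\times\mathbf{i} &= \mathbf{j}, \\
\mathbf{j}\times\mathbf{i} &= -\mathbf{k}, & \mathbf{k}\times\mathbf{j} &= -\mathbf{i}, & \mathbf{i}\times\mathbf{k} &= -\mathbf{j}.
\end{aligned} \tag{1.32c}$$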

Among the examples of the cross product in mathematical physics are the
relation between linear momentum p and angular momentum L (defining
angular momentum),

$$\mathbf{L} = \mathbf{r}\times\mathbf{p},$$

and the relation between linear velocity v and angular velocity $\boldsymbol{\omega}$,

$$\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}.$$

Vectors v and p describe properties of the particle or physical system. However,
the position vector r is determined by the choice of the origin of the coordinates.
This means that $\boldsymbol{\omega}$ and L depend on the choice of the origin.

The familiar magnetic induction B is usually defined by the vector product
force equation$^1$

$$\mathbf{F}_M = q\mathbf{v}\times\mathbf{B}.$$

Here v is the velocity of the electric charge q and $\mathbf{F}_M$ is the resulting force on
the moving charge.

The cross product has an important geometrical interpretation which we 
shall use in subsequent sections. In the parallelogram defined by A and B 
(Fig. 1.12) $B\sin\theta$ is the height if A is taken as the length of the base. Then
$|\mathbf{A}\times\mathbf{B}| = AB\sin\theta$ is the area of the parallelogram. As a vector, $\mathbf{A}\times\mathbf{B}$ is the
area of the parallelogram defined by A and B, with the area vector normal to 
the plane of the parallelogram. This suggests that area may be treated as a 
vector quantity. 



FIG. 1.12 Parallelogram representation of the vector product 

Parenthetically, it might be noted that Eq. 1.32c and a modified Eq. 1.32b
form the starting point for the development of quaternions. Equation 1.32b
is replaced by $\mathbf{i}\times\mathbf{i} = \mathbf{j}\times\mathbf{j} = \mathbf{k}\times\mathbf{k} = -1$.

An alternate definition of the vector product C = A x B consists in specifying 
the components of C : 

$$\begin{aligned}
C_x &= A_y B_z - A_z B_y, \\
C_y &= A_z B_x - A_x B_z, \\
C_z &= A_x B_y - A_y B_x,
\end{aligned} \tag{1.33}$$

or

$$C_i = A_j B_k - A_k B_j, \qquad i, j, k \text{ all different}, \tag{1.34}$$

and with cyclic permutation of the indices i, j, and k. The vector product C
may be conveniently represented by a determinant$^2$

$$\mathbf{C} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ A_x & A_y & A_z \\ B_x & B_y & B_z \end{vmatrix}. \tag{1.35}$$

Expansion of the determinant across the top row reproduces the three components
of C listed in Eq. 1.33.

1 The electric field E is assumed here to be zero.

Equation 1.32 might be called a geometric definition of the vector product. 
Then Eq. 1.33 would be an algebraic definition. 

EXAMPLE 1.4.1 

With A and B given in Example 1.1.1, 

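$$\mathbf{A} = 6\mathbf{i} + 4\mathbf{j} + 3\mathbf{k}, \qquad \mathbf{B} = 2\mathbf{i} - 3\mathbf{j} - 3\mathbf{k},$$

$$\begin{aligned}
\mathbf{A}\times\mathbf{B} &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 6 & 4 & 3 \\ 2 & -3 & -3 \end{vmatrix} \\
&= \mathbf{i}(-12 + 9) - \mathbf{j}(-18 - 6) + \mathbf{k}(-18 - 8) \\
&= -3\mathbf{i} + 24\mathbf{j} - 26\mathbf{k}.
\end{aligned}$$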
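
Example 1.4.1 can be reproduced directly from the component definition, Eq. 1.33. The sketch below is a minimal illustration; the function name `cross` is an arbitrary choice. It also checks the anticommutation property, Eq. 1.32a.

```python
def cross(a, b):
    """Component definition of the vector product, Eq. 1.33."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by,
            az * bx - ax * bz,
            ax * by - ay * bx)

A = (6, 4, 3)
B = (2, -3, -3)

print(cross(A, B))  # (-3, 24, -26)
print(cross(B, A))  # (3, -24, 26), the negative of A x B
```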

To show the equivalence of Eq. 1.32 and the component definition, Eq. 1.33, 
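let us form $\mathbf{A}\cdot\mathbf{C}$ and $\mathbf{B}\cdot\mathbf{C}$, using Eq. 1.33. We have

$$\begin{aligned}
\mathbf{A}\cdot\mathbf{C} &= \mathbf{A}\cdot(\mathbf{A}\times\mathbf{B}) \\
&= A_x(A_y B_z - A_z B_y) + A_y(A_z B_x - A_x B_z) + A_z(A_x B_y - A_y B_x) \\
&= 0.
\end{aligned} \tag{1.36}$$

Similarly,

$$\mathbf{B}\cdot\mathbf{C} = \mathbf{B}\cdot(\mathbf{A}\times\mathbf{B}) = 0. \tag{1.37}$$

Equations 1.36 and 1.37 show that C is perpendicular to both A and B ($\cos\theta = 0$,
$\theta = \pm 90^\circ$) and therefore perpendicular to the plane they determine. The positive
direction is determined by considering special cases such as the unit vectors
$\mathbf{i}\times\mathbf{j} = \mathbf{k}$ ($C_z = +A_x B_y$).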


2 See Section 4.1 for a summary of determinants.

The magnitude is obtained from

$$\begin{aligned}
(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{A}\times\mathbf{B}) &= A^2 B^2 - (\mathbf{A}\cdot\mathbf{B})^2 \\
&= A^2 B^2 - A^2 B^2\cos^2\theta \\
&= A^2 B^2\sin^2\theta.
\end{aligned} \tag{1.38}$$

Hence

$$C = AB\sin\theta. \tag{1.39}$$

The big first step in Eq. 1.38 may be verified by expanding out in component 
form, using Eq. 1.33 for A x B and Eq. 1.22 for the dot product. From Eqs. 
1.36, 1.37, and 1.39 we see the equivalence of Eqs. 1.32 and 1.33, the two 
definitions of vector product. 
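
For the vectors of Example 1.4.1 the three relations used here, Eqs. 1.36, 1.37, and the first step of Eq. 1.38, can be verified with exact integer arithmetic. The sketch below is only a spot check for one pair of vectors, not a proof; the helper names are illustrative.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    """Component definition of the vector product, Eq. 1.33."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

A = (6, 4, 3)
B = (2, -3, -3)
C = cross(A, B)

print(dot(A, C), dot(B, C))                     # 0 0  (Eqs. 1.36 and 1.37)
print(dot(C, C))                                # 1261
print(dot(A, A) * dot(B, B) - dot(A, B) ** 2)   # 1261 (first step of Eq. 1.38)
```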

There still remains the problem of verifying that C = A x B is indeed a 
vector; that is, it obeys Eq. 1.15, the vector transformation law. Starting in a 
rotated (primed system) 

$$\begin{aligned}
C'_i &= A'_j B'_k - A'_k B'_j, \qquad i, j, \text{ and } k \text{ in cyclic order}, \\
&= \sum_l a_{jl} A_l \sum_m a_{km} B_m - \sum_l a_{kl} A_l \sum_m a_{jm} B_m \\
&= \sum_{l,m} (a_{jl} a_{km} - a_{kl} a_{jm}) A_l B_m.
\end{aligned} \tag{1.40}$$

The combination of direction cosines in parentheses vanishes for m = l. We
therefore have j and k taking on fixed values, dependent on the choice of i,
and six combinations of l and m. If i = 3, then j = 1, k = 2 (cyclic order), and
we have the following direction cosine combinations

$$\begin{aligned}
a_{11} a_{22} - a_{21} a_{12} &= a_{33}, \\
a_{13} a_{21} - a_{23} a_{11} &= a_{32}, \\
a_{12} a_{23} - a_{22} a_{13} &= a_{31}
\end{aligned} \tag{1.41}$$


and their negatives. Equations 1.41 are identities satisfied by the direction 
cosines. They may be verified with the use of determinants and matrices 
(see Exercise 4.3.3). Substituting back into Eq. 1.40, 


$$\begin{aligned}
C'_3 &= a_{33} A_1 B_2 + a_{32} A_3 B_1 + a_{31} A_2 B_3 \\
&\quad - a_{33} A_2 B_1 - a_{32} A_1 B_3 - a_{31} A_3 B_2 \\
&= a_{31} C_1 + a_{32} C_2 + a_{33} C_3 \\
&= \sum_n a_{3n} C_n.
\end{aligned} \tag{1.42}$$

By permuting indices to pick up $C'_1$ and $C'_2$, we see that Eq. 1.15 is satisfied
and C is indeed a vector. It should be mentioned here that this vector nature of
the cross product is an accident associated with the three-dimensional nature
of ordinary space.$^3$ It will be seen in Chapter 3 that the cross product may also
be treated as a second-rank antisymmetric tensor!

If we define a vector as an ordered triple of numbers (or functions) as in the 
latter part of Section 1 .2, then there is no problem identifying the cross product 
as a vector. The cross-product operation maps the two triples A and B into a 
third triple C which by definition is a vector. 
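
The transformation property derived in Eqs. 1.40 to 1.42 can be illustrated numerically. The sketch below assumes a proper rotation built from two successive axis rotations (the angles 1.1 and 0.7 radians are arbitrary), then compares the cross product of the rotated vectors with the rotated cross product. It also evaluates the first identity of Eq. 1.41. All helper names are illustrative.

```python
import math

def rot_z(phi):
    """Direction cosines for axes rotated by phi about the z-axis."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(phi):
    """Direction cosines for axes rotated by phi about the x-axis."""
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transform(a, v):
    """Vector transformation law, Eq. 1.15: v'_i = sum_j a_ij v_j."""
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]

def cross(u, v):
    """Component definition of the vector product, Eq. 1.33."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

a = matmul(rot_x(0.7), rot_z(1.1))   # a proper rotation (no reflection)
A = [6.0, 4.0, 3.0]
B = [2.0, -3.0, -3.0]

rotate_then_cross = cross(transform(a, A), transform(a, B))
cross_then_rotate = transform(a, cross(A, B))
print(rotate_then_cross)
print(cross_then_rotate)   # same components, to rounding error

# First line of Eq. 1.41 (indices shifted to start at 0):
print(a[0][0] * a[1][1] - a[1][0] * a[0][1], a[2][2])
```

The agreement relies on the rotation being proper. For a reflection of the axes the two results differ in sign, a point the text takes up again in Chapter 3.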

We now have two ways of multiplying vectors; a third form appears in 
Chapter 3. But what about division by a vector? It turns out that the ratio 
B/A is not uniquely specified (Exercise 4.2.19) unless A and B are also required 
to be parallel. Hence division of one vector by another is not defined. 


EXERCISES 

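1.4.1 Two vectors A and B are given by

$$\mathbf{A} = 2\mathbf{i} + 4\mathbf{j} + 6\mathbf{k},$$

$$\mathbf{B} = 3\mathbf{i} - 3\mathbf{j} - 5\mathbf{k}.$$

Compute the scalar and vector products $\mathbf{A}\cdot\mathbf{B}$ and $\mathbf{A}\times\mathbf{B}$.

1.4.2 Show the equivalence of Eq. 1.32 and the component definition Eq. 1.33 by
expanding A, B, and C in $\mathbf{C} = \mathbf{A}\times\mathbf{B}$ in cartesian components.

1.4.3 Starting with $\mathbf{C} = \mathbf{A} + \mathbf{B}$, show that $\mathbf{C}\times\mathbf{C} = 0$ leads to

$$\mathbf{A}\times\mathbf{B} = -\mathbf{B}\times\mathbf{A}.$$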

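1.4.4 Show that

(a) $(\mathbf{A} - \mathbf{B})\cdot(\mathbf{A} + \mathbf{B}) = A^2 - B^2$,

(b) $(\mathbf{A} - \mathbf{B})\times(\mathbf{A} + \mathbf{B}) = 2\mathbf{A}\times\mathbf{B}$.

The distributive laws needed here,

$$\mathbf{A}\cdot(\mathbf{B} + \mathbf{C}) = \mathbf{A}\cdot\mathbf{B} + \mathbf{A}\cdot\mathbf{C}$$

and

$$\mathbf{A}\times(\mathbf{B} + \mathbf{C}) = \mathbf{A}\times\mathbf{B} + \mathbf{A}\times\mathbf{C},$$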

may easily be verified (if desired) by expansion in cartesian components. 

1.4.5 Given the three vectors, 

P = 3i + 2j - k, 

Q= -6i-4j + 2k, 

R = i - 2j - k, 

find two that are perpendicular and two that are parallel or antiparallel. 


3 Specifically, Eq. 1.41 holds only for three-dimensional space. Technically, it
is also possible to define a cross product in $R^7$, seven-dimensional space, but
the cross product turns out to have unacceptable (pathological) properties.

1.4.6 If $\mathbf{P} = \mathbf{i}P_x + \mathbf{j}P_y$ and $\mathbf{Q} = \mathbf{i}Q_x + \mathbf{j}Q_y$ are any two nonparallel (also nonantiparallel)
vectors in the xy-plane, show that $\mathbf{P}\times\mathbf{Q}$ is in the z-direction.

1.4.7 Prove that $(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{A}\times\mathbf{B}) = (AB)^2 - (\mathbf{A}\cdot\mathbf{B})^2$.

1.4.8 Using the vectors

$$\mathbf{P} = \mathbf{i}\cos\theta + \mathbf{j}\sin\theta,$$

$$\mathbf{Q} = \mathbf{i}\cos\varphi - \mathbf{j}\sin\varphi,$$

$$\mathbf{R} = \mathbf{i}\cos\varphi + \mathbf{j}\sin\varphi,$$

prove the familiar trigonometric identities

$$\sin(\theta + \varphi) = \sin\theta\cos\varphi + \cos\theta\sin\varphi,$$

$$\cos(\theta + \varphi) = \cos\theta\cos\varphi - \sin\theta\sin\varphi.$$

1.4.9 (a) Find a vector A that is perpendicular to

$$\mathbf{U} = 2\mathbf{i} + \mathbf{j} - \mathbf{k},$$

$$\mathbf{V} = \mathbf{i} - \mathbf{j} + \mathbf{k}.$$

(b) What is A if, in addition to this requirement, we also demand that it have
unit magnitude?

1.4.10 If four vectors a, b, c, and d all lie in the same plane, show that

$$(\mathbf{a}\times\mathbf{b})\times(\mathbf{c}\times\mathbf{d}) = 0.$$

Hint. Consider the directions of the cross-product vectors.

1.4.11 The coordinates of the three vertices of a triangle are (2, 1, 5), (5, 2, 8), and (4, 8, 2).
Compute its area by vector methods.

1.4.12 The vertices of parallelogram ABCD are (1, 0, 0), (2, -1, 0), (0, -1, 1), and
(-1, 0, 1) in order. Calculate the vector areas of triangle ABD and of triangle
BCD. Are the two vector areas equal?

ANS. $\text{Area}_{ABD} = -\tfrac{1}{2}(\mathbf{i} + \mathbf{j} + 2\mathbf{k})$.

1.4.13 The origin and the three vectors A, B, and C (all of which start at the origin)
define a tetrahedron. Taking the outward direction as positive, calculate the total
vector area of the four tetrahedral surfaces.

Note. In Section 1.11 this result is generalized to any closed surface.

1.4.14 Find the sides and angles of the spherical triangle ABC defined by the three vectors

$$\mathbf{A} = (1, 0, 0),$$

$$\mathbf{B} = \left(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\right),$$

and

$$\mathbf{C} = \left(0, \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right).$$

Each vector starts from the origin (Fig. 1.13).



FIG. 1.13 Spherical triangle 


1.4.15 Derive the law of sines:

$$\frac{\sin\alpha}{|\mathbf{A}|} = \frac{\sin\beta}{|\mathbf{B}|} = \frac{\sin\gamma}{|\mathbf{C}|}.$$



1.4.16 The magnetic induction B is defined by the Lorentz force equation

$$\mathbf{F} = q(\mathbf{v}\times\mathbf{B}).$$

Carrying out three experiments, we find that if

$$\mathbf{v} = \mathbf{i}, \qquad \frac{\mathbf{F}}{q} = 2\mathbf{k} - 4\mathbf{j},$$

$$\mathbf{v} = \mathbf{j}, \qquad \frac{\mathbf{F}}{q} = 4\mathbf{i} - \mathbf{k},$$

and

$$\mathbf{v} = \mathbf{k}, \qquad \frac{\mathbf{F}}{q} = \mathbf{j} - 2\mathbf{i}.$$

From the results of these three separate experiments calculate the magnetic
induction B.


1.5 TRIPLE SCALAR PRODUCT, TRIPLE VECTOR 
PRODUCT 

TRIPLE SCALAR PRODUCT 

Sections 1.3 and 1.4 cover the two types of multiplication of interest here. 
However, there are combinations of three vectors, $\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C})$ and $\mathbf{A}\times(\mathbf{B}\times\mathbf{C})$,
which occur with sufficient frequency to deserve further attention. The combination

$$\mathbf{A}\cdot(\mathbf{B}\times\mathbf{C})$$

is known as the triple scalar product. $\mathbf{B}\times\mathbf{C}$ yields a vector which, dotted into
A, gives a scalar. We note that $(\mathbf{A}\cdot\mathbf{B})\times\mathbf{C}$ represents a scalar crossed into a
vector, an operation that is not defined. Hence, if we agree to exclude this
undefined interpretation, the parentheses may be omitted and the triple scalar
product written $\mathbf{A}\cdot\mathbf{B}\times\mathbf{C}$.

Using Eq. 1.33 for the cross product and Eq. 1.22 for the dot product, we 
obtain 

$$\begin{aligned}
\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} &= A_x(B_y C_z - B_z C_y) + A_y(B_z C_x - B_x C_z) + A_z(B_x C_y - B_y C_x) \\
&= \mathbf{B}\cdot\mathbf{C}\times\mathbf{A} = \mathbf{C}\cdot\mathbf{A}\times\mathbf{B} \\
&= -\mathbf{A}\cdot\mathbf{C}\times\mathbf{B} = -\mathbf{C}\cdot\mathbf{B}\times\mathbf{A} = -\mathbf{B}\cdot\mathbf{A}\times\mathbf{C}, \text{ and so on}.
\end{aligned} \tag{1.43}$$

The high degree of symmetry present in the component expansion should be
noted. Every term contains the factors $A_i$, $B_j$, and $C_k$. If i, j, and k are in cyclic
order (x, y, z), the sign is positive. If the order is anticyclic, the sign is negative.
Further, the dot and the cross may be interchanged,

$$\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} = \mathbf{A}\times\mathbf{B}\cdot\mathbf{C}. \tag{1.44}$$

A convenient representation of the component expansion of Eq. 1.43 is provided
by the determinant

$$\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} = \begin{vmatrix} A_x & A_y & A_z \\ B_x & B_y & B_z \\ C_x & C_y & C_z \end{vmatrix}. \tag{1.45}$$

The rules for interchanging rows and columns of a determinant$^1$ provide an
immediate verification of the permutations listed in Eq. 1.43, whereas the
symmetry of A, B, and C in the determinant form suggests the relation given in
Eq. 1.44.

The triple products encountered in Section 1.4, which showed that A x B 
was perpendicular to both A and B, were special cases of the general result 
(Eq. 1.43). 

The triple scalar product has a direct geometrical interpretation. The three 
vectors A, B, and C may be interpreted as defining a parallelepiped (Fig. 1.14). 


$$|\mathbf{B}\times\mathbf{C}| = BC\sin\theta = \text{area of parallelogram base}. \tag{1.46}$$

The direction, of course, is normal to the base. Dotting A into this means multiplying
the base area by the projection of A onto the normal, or base times height.
Therefore

$$\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} = \text{volume of parallelepiped defined by } \mathbf{A}, \mathbf{B}, \text{ and } \mathbf{C}.$$





1 See Section 4.1 for a summary of the properties of determinants.

EXAMPLE 1.5.1 A parallelepiped

For

$$\mathbf{A} = \mathbf{i} + 2\mathbf{j} - \mathbf{k}, \qquad \mathbf{B} = \mathbf{j} + \mathbf{k}, \qquad \mathbf{C} = \mathbf{i} - \mathbf{j},$$

$$\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} = \begin{vmatrix} 1 & 2 & -1 \\ 0 & 1 & 1 \\ 1 & -1 & 0 \end{vmatrix}. \tag{1.47}$$

By expansion by minors across the top row the determinant equals

$$1(0 + 1) - 2(0 - 1) - 1(0 - 1) = 4.$$

This is the volume of the parallelepiped defined by A, B, and C. The reader
should note that $\mathbf{A}\cdot\mathbf{B}\times\mathbf{C}$ may sometimes turn out to be negative! This
problem and its interpretation are considered in Chapter 3.
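
The value found in Example 1.5.1, and the permutation rules of Eq. 1.43, can be checked with a few lines of Python. This is a minimal sketch; the name `triple_scalar` is an illustrative choice.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    """Component definition of the vector product, Eq. 1.33."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def triple_scalar(a, b, c):
    """A . (B x C), equal to the determinant of Eq. 1.45."""
    return dot(a, cross(b, c))

A = (1, 2, -1)
B = (0, 1, 1)
C = (1, -1, 0)

print(triple_scalar(A, B, C))   # 4, the volume of the parallelepiped
print(triple_scalar(B, C, A))   # 4, cyclic permutation
print(triple_scalar(A, C, B))   # -4, anticyclic permutation
print(dot(cross(A, B), C))      # 4, dot and cross interchanged (Eq. 1.44)
```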

The triple scalar product finds an interesting and important application 
in the construction of a reciprocal crystal lattice. Let a, b, and c (not necessarily 
mutually perpendicular) represent the vectors that define a crystal lattice. The 
distance from one lattice point to another may then be written 

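$$\mathbf{r} = n_a\mathbf{a} + n_b\mathbf{b} + n_c\mathbf{c}, \tag{1.48}$$

with $n_a$, $n_b$, and $n_c$ taking on integral values. With these vectors we may form

$$\mathbf{a}' = \frac{\mathbf{b}\times\mathbf{c}}{\mathbf{a}\cdot\mathbf{b}\times\mathbf{c}}, \qquad \mathbf{b}' = \frac{\mathbf{c}\times\mathbf{a}}{\mathbf{a}\cdot\mathbf{b}\times\mathbf{c}}, \qquad \mathbf{c}' = \frac{\mathbf{a}\times\mathbf{b}}{\mathbf{a}\cdot\mathbf{b}\times\mathbf{c}}. \tag{1.48a}$$

We see that a' is perpendicular to the plane containing b and c and has a magnitude
proportional to $a^{-1}$. In fact, we can readily show that

$$\mathbf{a}'\cdot\mathbf{a} = \mathbf{b}'\cdot\mathbf{b} = \mathbf{c}'\cdot\mathbf{c} = 1, \tag{1.48b}$$

whereas

$$\mathbf{a}'\cdot\mathbf{b} = \mathbf{a}'\cdot\mathbf{c} = \mathbf{b}'\cdot\mathbf{a} = \mathbf{b}'\cdot\mathbf{c} = \mathbf{c}'\cdot\mathbf{a} = \mathbf{c}'\cdot\mathbf{b} = 0. \tag{1.48c}$$

It is from Eqs. 1.48b and 1.48c that the name reciprocal lattice is derived. The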
mathematical space in which this reciprocal lattice exists is sometimes called 
a Fourier space, on the basis of relations to the Fourier analysis of Chapters 
14 and 15. This reciprocal lattice is useful in problems involving the scattering 
of waves from the various planes in a crystal. Further details may be found in 
R. B. Leighton’s Principles of Modern Physics , pp. 440-448 [New York: 
McGraw-Hill (1959)]. We encounter the reciprocal lattice again in an analysis 
of oblique coordinate systems, Section 4.4. 
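
As an illustration of Eqs. 1.48a to 1.48c, the following sketch constructs the reciprocal vectors for one oblique set of lattice vectors and evaluates all nine dot products. The particular vectors a, b, c are an arbitrary assumption chosen to be non-orthogonal, and exact rational arithmetic is used so that the results are exactly 1 or 0.

```python
from fractions import Fraction

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    """Component definition of the vector product, Eq. 1.33."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def reciprocal_vectors(a, b, c):
    """Reciprocal vectors a', b', c' of Eq. 1.48a; needs a . b x c != 0."""
    volume = dot(a, cross(b, c))
    if volume == 0:
        raise ValueError("a, b, and c are coplanar")
    def scaled(v):
        return tuple(Fraction(x) / volume for x in v)
    return scaled(cross(b, c)), scaled(cross(c, a)), scaled(cross(a, b))

# Hypothetical oblique lattice vectors (not mutually perpendicular).
a, b, c = (1, 0, 0), (1, 2, 0), (1, 1, 3)
a_p, b_p, c_p = reciprocal_vectors(a, b, c)

# Rows: a', b', c'; columns: a, b, c. Eqs. 1.48b and 1.48c give the unit matrix.
table = [[dot(p, v) for v in (a, b, c)] for p in (a_p, b_p, c_p)]
for row in table:
    print([int(entry) for entry in row])
```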


TRIPLE VECTOR PRODUCT 

The second triple product of interest is A x (B x C). Here the parentheses 
must be retained, as may be seen by considering the special case 

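$$\mathbf{i}\times(\mathbf{i}\times\mathbf{j}) = \mathbf{i}\times\mathbf{k} = -\mathbf{j} \tag{1.49}$$

but

$$(\mathbf{i}\times\mathbf{i})\times\mathbf{j} = 0.$$

The fact that the triple vector product is a vector follows from our discussion
of vector product. Also, we see that the direction of the resulting vector is
perpendicular to A and to $\mathbf{B}\times\mathbf{C}$. The plane defined by B and C is perpendicular
to $\mathbf{B}\times\mathbf{C}$ and so $\mathbf{A}\times(\mathbf{B}\times\mathbf{C})$ lies in this plane. Specifically, if B and C lie in
the xy-plane, then $\mathbf{B}\times\mathbf{C}$ is in the z-direction and $\mathbf{A}\times(\mathbf{B}\times\mathbf{C})$ is back in the
xy-plane (Fig. 1.15). This means that $\mathbf{A}\times(\mathbf{B}\times\mathbf{C})$ will be a linear combination
of B and C. We find that

$$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = \mathbf{B}(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}(\mathbf{A}\cdot\mathbf{B}), \tag{1.50}$$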

a relation sometimes known as the BAC-CAB rule. This result may be verified 
by the direct though not very elegant method of expanding into cartesian 
components (see Exercise 1.5.2). 


FIG. 1.15 B and C are in the xy-plane. $\mathbf{B}\times\mathbf{C}$ is perpendicular to the xy-plane and is
shown here along the z-axis. Then $\mathbf{A}\times(\mathbf{B}\times\mathbf{C})$ is perpendicular to the z-axis and
therefore is back in the xy-plane.


An alternate derivation using the Levi-Civita $\varepsilon_{ijk}$ of Section 3.4 is the topic
of Exercise 3.4.8.

The BAC-CAB rule is probably the single most important vector identity. 
Because of its frequent use in problems and in future derivations, the rule 
probably should be memorized. 

It might be noted here that as vectors are independent of the coordinates 
so a vector equation is independent of the particular coordinate system. The 
coordinate system only determines the components. If the vector equation 
can be established in cartesian coordinates, it is established and valid in any 
of the coordinate systems to be introduced in Chapter 2. 


EXAMPLE 1.5.2 A triple vector product 

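By using the three vectors given in Example 1.5.1, we obtain

$$\begin{aligned}
\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) &= (\mathbf{j} + \mathbf{k})(1 - 2) - (\mathbf{i} - \mathbf{j})(2 - 1) \\
&= -\mathbf{i} - \mathbf{k}
\end{aligned}$$

by Eq. 1.50. In detail,

$$\mathbf{B}\times\mathbf{C} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 0 & 1 & 1 \\ 1 & -1 & 0 \end{vmatrix} = \mathbf{i} + \mathbf{j} - \mathbf{k}$$

and

$$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 2 & -1 \\ 1 & 1 & -1 \end{vmatrix} = -\mathbf{i} - \mathbf{k}.$$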


Other, more complicated, products may be simplified by using these forms 
of the triple scalar and triple vector products. 
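
Both routes of Example 1.5.2 can be compared in code. The sketch below evaluates the left-hand side of the BAC-CAB rule (Eq. 1.50) by two successive cross products and the right-hand side from dot products; the helper names are illustrative. It also repeats the special case of Eq. 1.49 to show that the parentheses matter.

```python
def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    """Component definition of the vector product, Eq. 1.33."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def bac_cab(a, b, c):
    """Right-hand side of Eq. 1.50: B(A . C) - C(A . B)."""
    ac, ab = dot(a, c), dot(a, b)
    return tuple(bi * ac - ci * ab for bi, ci in zip(b, c))

A = (1, 2, -1)
B = (0, 1, 1)
C = (1, -1, 0)

print(cross(B, C))            # (1, 1, -1)
print(cross(A, cross(B, C)))  # (-1, 0, -1)
print(bac_cab(A, B, C))       # (-1, 0, -1)

i, j = (1, 0, 0), (0, 1, 0)
print(cross(i, cross(i, j)))  # (0, -1, 0)
print(cross(cross(i, i), j))  # (0, 0, 0)
```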


EXERCISES 

1.5.1 One vertex of a glass parallelepiped is at the origin. The three adjacent vertices
are at (3, 0, 0), (0, 0, 2), and (0, 3, 1). All lengths are in centimeters. Calculate the
number of cubic centimeters of glass in the parallelepiped by using the triple
scalar product.

1.5.2 Verify the expansion of the triple vector product

$$\mathbf{A}\times(\mathbf{B}\times\mathbf{C}) = \mathbf{B}(\mathbf{A}\cdot\mathbf{C}) - \mathbf{C}(\mathbf{A}\cdot\mathbf{B})$$

by direct expansion in cartesian coordinates.

1.5.3 Show that the first step in Eq. 1.38, which is

$$(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{A}\times\mathbf{B}) = A^2 B^2 - (\mathbf{A}\cdot\mathbf{B})^2,$$

is consistent with the BAC-CAB rule for a triple vector product.

1.5.4 Given the three vectors A, B, and C,

$$\mathbf{A} = \mathbf{i} + \mathbf{j},$$

$$\mathbf{B} = \mathbf{j} + \mathbf{k},$$

$$\mathbf{C} = \mathbf{i} - \mathbf{k}.$$

(a) Compute the triple scalar product, $\mathbf{A}\cdot\mathbf{B}\times\mathbf{C}$. Noting that $\mathbf{A} = \mathbf{B} + \mathbf{C}$, give
a geometric interpretation of your result for the triple scalar product.

(b) Compute $\mathbf{A}\times(\mathbf{B}\times\mathbf{C})$.

1.5.5 The angular momentum L of a particle is given by $\mathbf{L} = \mathbf{r}\times\mathbf{p} = m\mathbf{r}\times\mathbf{v}$, where p
is the linear momentum. With linear and angular velocity related by $\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$,
show that

$$\mathbf{L} = mr^2[\boldsymbol{\omega} - \mathbf{r}_0(\mathbf{r}_0\cdot\boldsymbol{\omega})].$$

Here $\mathbf{r}_0$ is a unit vector in the r direction. For $\mathbf{r}\cdot\boldsymbol{\omega} = 0$ this reduces to $\mathbf{L} = I\boldsymbol{\omega}$,
with the moment of inertia I given by $mr^2$. In Section 4.6 this result is generalized
to form an inertia tensor.

1.5.6 The kinetic energy of a single particle is given by $T = \tfrac{1}{2}mv^2$. For rotational
motion this becomes $\tfrac{1}{2}m(\boldsymbol{\omega}\times\mathbf{r})^2$. Show that

$$T = \tfrac{1}{2}m[r^2\omega^2 - (\mathbf{r}\cdot\boldsymbol{\omega})^2].$$

For $\mathbf{r}\cdot\boldsymbol{\omega} = 0$ this reduces to $T = \tfrac{1}{2}I\omega^2$ with the moment of inertia I given by $mr^2$.

1.5.7 Show that 

a x (b x c) + b x (c x a) + c x (a x b) = 0. 

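1.5.8 A vector A is decomposed into a radial vector $\mathbf{A}_r$ and a tangential vector $\mathbf{A}_t$.
If $\mathbf{r}_0$ is a unit vector in the radial direction, show that

(a) $\mathbf{A}_r = \mathbf{r}_0(\mathbf{A}\cdot\mathbf{r}_0)$

and

(b) $\mathbf{A}_t = -\mathbf{r}_0\times(\mathbf{r}_0\times\mathbf{A})$.

1.5.9 Prove that a necessary and sufficient condition for the three (nonvanishing)
vectors A, B, and C to be coplanar is the vanishing of the triple scalar product

$$\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} = 0.$$

1.5.10 Three vectors A, B, and C are given by

$$\mathbf{A} = 3\mathbf{i} - 2\mathbf{j} + 2\mathbf{k},$$

$$\mathbf{B} = 6\mathbf{i} + 4\mathbf{j} - 2\mathbf{k},$$

$$\mathbf{C} = -3\mathbf{i} - 2\mathbf{j} - 4\mathbf{k}.$$

Compute the values of $\mathbf{A}\cdot\mathbf{B}\times\mathbf{C}$ and $\mathbf{A}\times(\mathbf{B}\times\mathbf{C})$, $\mathbf{C}\times(\mathbf{A}\times\mathbf{B})$ and
$\mathbf{B}\times(\mathbf{C}\times\mathbf{A})$.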


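1.5.11 Vector D is a linear combination of three noncoplanar (and nonorthogonal)
vectors:

$$\mathbf{D} = a\mathbf{A} + b\mathbf{B} + c\mathbf{C}.$$

Show that the coefficients are given by a ratio of triple scalar products,

$$a = \frac{\mathbf{D}\cdot\mathbf{B}\times\mathbf{C}}{\mathbf{A}\cdot\mathbf{B}\times\mathbf{C}},$$

and so on.

1.5.12 Show that

$$(\mathbf{A}\times\mathbf{B})\cdot(\mathbf{C}\times\mathbf{D}) = (\mathbf{A}\cdot\mathbf{C})(\mathbf{B}\cdot\mathbf{D}) - (\mathbf{A}\cdot\mathbf{D})(\mathbf{B}\cdot\mathbf{C}).$$

1.5.13 Show that

$$(\mathbf{A}\times\mathbf{B})\times(\mathbf{C}\times\mathbf{D}) = (\mathbf{A}\cdot\mathbf{B}\times\mathbf{D})\mathbf{C} - (\mathbf{A}\cdot\mathbf{B}\times\mathbf{C})\mathbf{D}.$$

1.5.14 For a spherical triangle such as pictured in Fig. 1.13 show that

$$\frac{\sin A}{\sin\overline{BC}} = \frac{\sin B}{\sin\overline{CA}} = \frac{\sin C}{\sin\overline{AB}}.$$

Here $\sin A$ is the sine of the included angle at A while $\overline{BC}$ is the side opposite
(in radians).

Hint. Exercise 1.5.13 will be useful.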


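1.5.15 Given

$$\mathbf{a}' = \frac{\mathbf{b}\times\mathbf{c}}{\mathbf{a}\cdot\mathbf{b}\times\mathbf{c}}, \qquad \mathbf{b}' = \frac{\mathbf{c}\times\mathbf{a}}{\mathbf{a}\cdot\mathbf{b}\times\mathbf{c}}, \qquad \mathbf{c}' = \frac{\mathbf{a}\times\mathbf{b}}{\mathbf{a}\cdot\mathbf{b}\times\mathbf{c}},$$

and

$$\mathbf{a}\cdot\mathbf{b}\times\mathbf{c} \neq 0,$$

show that

(a) $\mathbf{x}'\cdot\mathbf{y} = \delta_{xy}$, (x, y = a, b, c),

(b) $\mathbf{a}'\cdot\mathbf{b}'\times\mathbf{c}' = (\mathbf{a}\cdot\mathbf{b}\times\mathbf{c})^{-1}$,

(c) $\mathbf{a} = \dfrac{\mathbf{b}'\times\mathbf{c}'}{\mathbf{a}'\cdot\mathbf{b}'\times\mathbf{c}'}$.

1.5.16 If $\mathbf{x}'\cdot\mathbf{y} = \delta_{xy}$, (x, y = a, b, c), prove that

$$\mathbf{a}' = \frac{\mathbf{b}\times\mathbf{c}}{\mathbf{a}\cdot\mathbf{b}\times\mathbf{c}}.$$

(This is the converse of Problem 1.5.15.)

1.5.17 Show that any vector V may be expressed in terms of the reciprocal vectors
a', b', c' by

$$\mathbf{V} = (\mathbf{V}\cdot\mathbf{a})\mathbf{a}' + (\mathbf{V}\cdot\mathbf{b})\mathbf{b}' + (\mathbf{V}\cdot\mathbf{c})\mathbf{c}'.$$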

1.5.18 An electric charge $q_1$ moving with velocity $\mathbf{v}_1$ produces a magnetic induction
B given by

$$\mathbf{B} = \frac{\mu_0}{4\pi}\, q_1 \frac{\mathbf{v}_1\times\mathbf{r}_0}{r^2} \quad \text{(mks units)},$$

where $\mathbf{r}_0$ points from $q_1$ to the point at which B is measured (Biot and Savart law).

(a) Show that the magnetic force on a second charge $q_2$, velocity $\mathbf{v}_2$, is given
by the triple vector product

$$\mathbf{F}_2 = \frac{\mu_0}{4\pi} \frac{q_1 q_2}{r^2}\, \mathbf{v}_2\times(\mathbf{v}_1\times\mathbf{r}_0).$$

(b) Write out the corresponding magnetic force $\mathbf{F}_1$ that $q_2$ exerts on $q_1$. Define
your unit radial vector. How do $\mathbf{F}_1$ and $\mathbf{F}_2$ compare?

(c) Calculate $\mathbf{F}_1$ and $\mathbf{F}_2$ for the case of $q_1$ and $q_2$ moving along parallel trajectories
side by side.

ANS. (b) $\mathbf{F}_1 = -\dfrac{\mu_0}{4\pi} \dfrac{q_1 q_2}{r^2}\, \mathbf{v}_1\times(\mathbf{v}_2\times\mathbf{r}_0)$.

In general, there is no simple relation between $\mathbf{F}_1$ and
$\mathbf{F}_2$. Specifically, Newton’s third law, $\mathbf{F}_1 = -\mathbf{F}_2$, does not
hold.

(c) $\mathbf{F}_1 = \dfrac{\mu_0}{4\pi} \dfrac{q_1 q_2}{r^2}\, v^2 \mathbf{r}_0 = -\mathbf{F}_2$.
Mutual attraction.


1.6 GRADIENT, $\nabla$


Suppose that $\varphi(x, y, z)$ is a scalar point function, that is, a function whose
value depends on the values of the coordinates (x, y, z). As a scalar, it must have
the same value at a given fixed point in space, independent of the rotation of
our coordinate system, or

$$\varphi'(x'_1, x'_2, x'_3) = \varphi(x_1, x_2, x_3). \tag{1.51}$$

By differentiating with respect to $x'_i$ we obtain

$$\begin{aligned}
\frac{\partial\varphi'(x'_1, x'_2, x'_3)}{\partial x'_i} &= \frac{\partial\varphi(x_1, x_2, x_3)}{\partial x'_i} \\
&= \sum_j \frac{\partial\varphi}{\partial x_j} \frac{\partial x_j}{\partial x'_i} = \sum_j a_{ij} \frac{\partial\varphi}{\partial x_j}
\end{aligned} \tag{1.52}$$

by the rules of partial differentiation and Eq. 1.16. But comparison with Eq.
1.17, the vector transformation law, now shows that we have constructed a
vector with components $\partial\varphi/\partial x_j$. This vector we label the gradient of $\varphi$.

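A convenient symbolism is

$$\nabla\varphi = \mathbf{i}\frac{\partial\varphi}{\partial x} + \mathbf{j}\frac{\partial\varphi}{\partial y} + \mathbf{k}\frac{\partial\varphi}{\partial z} \tag{1.53}$$

or

$$\nabla = \mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}. \tag{1.54}$$

$\nabla\varphi$ (or del $\varphi$) is our gradient of the scalar $\varphi$, whereas $\nabla$ (del) itself is a vector
differential operator (available to operate on or to differentiate a scalar $\varphi$). It
should be emphasized that this operator is a hybrid creature that must satisfy
both the laws for handling vectors and the laws of partial differentiation.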
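
In numerical work the partial derivatives of Eq. 1.53 are often approximated by finite differences. The sketch below is one such approximation, using central differences with a step h that is an assumed, problem-dependent choice. The test function $\varphi = x^2 y + z^3$ is a hypothetical example, not taken from the text; its exact gradient is $(2xy, x^2, 3z^2)$.

```python
def numerical_gradient(phi, point, h=1e-6):
    """Approximate the components of grad(phi), Eq. 1.53, by central differences."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((phi(*forward) - phi(*backward)) / (2.0 * h))
    return grad

def phi(x, y, z):
    """Hypothetical scalar point function used only for illustration."""
    return x * x * y + z ** 3

approx = numerical_gradient(phi, (1.0, 2.0, 3.0))
exact = [2.0 * 1.0 * 2.0, 1.0 ** 2, 3.0 * 3.0 ** 2]   # (2xy, x^2, 3z^2)
print(approx)
print(exact)   # [4.0, 1.0, 27.0]
```

The central difference is accurate to second order in h, but for very small h the subtraction of nearly equal numbers limits the precision, so h should not be made arbitrarily small.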


EXAMPLE 1.6.1 The Gradient of a Function of r. 

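Let us calculate the gradient of $f(r) = f(\sqrt{x^2 + y^2 + z^2})$.

$$\nabla f(r) = \mathbf{i}\frac{\partial f(r)}{\partial x} + \mathbf{j}\frac{\partial f(r)}{\partial y} + \mathbf{k}\frac{\partial f(r)}{\partial z}.$$

Now f(r) depends on x through the dependence of r on x. Therefore$^1$

$$\frac{\partial f(r)}{\partial x} = \frac{df(r)}{dr}\frac{\partial r}{\partial x}.$$

From r as a function of x, y, z

$$\frac{\partial r}{\partial x} = \frac{\partial (x^2 + y^2 + z^2)^{1/2}}{\partial x} = \frac{x}{(x^2 + y^2 + z^2)^{1/2}} = \frac{x}{r}.$$

Therefore

$$\frac{\partial f(r)}{\partial x} = \frac{df(r)}{dr}\frac{x}{r}.$$

Permuting coordinates ($x \to y$, $y \to z$, $z \to x$) to obtain the y and z derivatives,
we get

$$\begin{aligned}
\nabla f(r) &= (\mathbf{i}x + \mathbf{j}y + \mathbf{k}z)\frac{1}{r}\frac{df}{dr} \\
&= \frac{\mathbf{r}}{r}\frac{df}{dr} \\
&= \mathbf{r}_0\frac{df}{dr}.
\end{aligned}$$

Here $\mathbf{r}_0$ is a unit vector ($\mathbf{r}/r$) in the positive radial direction. The gradient of a
function of r is a vector in the (positive or negative) radial direction. In Section
2.5 $\mathbf{r}_0$ is seen as one of the three orthonormal unit vectors of spherical polar
coordinates.

1 This is a special case of the chain rule of partial differentiation:

$$\frac{\partial f(r, \theta, \varphi)}{\partial x} = \frac{\partial f}{\partial r}\frac{\partial r}{\partial x} + \frac{\partial f}{\partial\theta}\frac{\partial\theta}{\partial x} + \frac{\partial f}{\partial\varphi}\frac{\partial\varphi}{\partial x}.$$

Here $\partial f/\partial\theta = \partial f/\partial\varphi = 0$, $\partial f/\partial r \to df/dr$.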
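
The result $\nabla f(r) = \mathbf{r}_0\, df/dr$ can be spot-checked numerically. The sketch below assumes the particular choice $f(r) = 1/r$, for which $df/dr = -1/r^2$, and compares a central-difference gradient with the radial formula at one point. The point (1, 2, 2), with r = 3, and the step size are arbitrary choices.

```python
import math

def numerical_gradient(phi, point, h=1e-6):
    """Central-difference approximation to the components of grad(phi)."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((phi(*forward) - phi(*backward)) / (2.0 * h))
    return grad

def f(x, y, z):
    """Assumed example: f(r) = 1/r with r = (x^2 + y^2 + z^2)^(1/2)."""
    return 1.0 / math.sqrt(x * x + y * y + z * z)

point = (1.0, 2.0, 2.0)
r = math.sqrt(sum(c * c for c in point))   # r = 3
dfdr = -1.0 / r ** 2                       # derivative of 1/r

radial_formula = [dfdr * c / r for c in point]   # r_0 df/dr
finite_difference = numerical_gradient(f, point)

print(radial_formula)      # each component is -x_i / 27
print(finite_difference)   # agrees to the accuracy of the difference scheme
```

Because df/dr is negative here, the gradient points in the negative radial direction, toward the origin.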


A GEOMETRICAL INTERPRETATION 

One immediate application of $\nabla\varphi$ is to dot it into an increment of length

$$d\mathbf{r} = \mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz. \tag{1.55}$$

Thus we obtain

$$\begin{aligned}
(\nabla\varphi)\cdot d\mathbf{r} &= \frac{\partial\varphi}{\partial x}dx + \frac{\partial\varphi}{\partial y}dy + \frac{\partial\varphi}{\partial z}dz \\
&= d\varphi,
\end{aligned} \tag{1.56}$$

the change in the scalar function $\varphi$ corresponding to a change in position $d\mathbf{r}$.
Now consider P and Q to be two points on a surface $\varphi(x, y, z) = C$, a constant.
These points are chosen so that Q is a distance $d\mathbf{r}$ from P. Then moving from
P to Q, the change in $\varphi(x, y, z) = C$ is given by

$$d\varphi = (\nabla\varphi)\cdot d\mathbf{r} = 0, \tag{1.57}$$

since we stay on the surface $\varphi(x, y, z) = C$. This shows that $\nabla\varphi$ is perpendicular
to $d\mathbf{r}$. Since $d\mathbf{r}$ may have any direction from P as long as it stays in the surface
$\varphi$, point Q being restricted to the surface, but having arbitrary direction, $\nabla\varphi$ is
seen as normal to the surface $\varphi$ = constant (Fig. 1.16).






FIG. 1.16 The length increment $d\mathbf{r}$ is required to stay on the surface $\varphi = C$.


If we now permit $d\mathbf{r}$ to take us from one surface $\varphi = C_1$ to an adjacent
surface $\varphi = C_2$ (Fig. 1.17a),

$$d\varphi = C_2 - C_1 = \Delta C = (\nabla\varphi)\cdot d\mathbf{r}. \tag{1.58}$$

For a given $d\varphi$, $|d\mathbf{r}|$ is a minimum when it is chosen parallel to $\nabla\varphi$ ($\cos\theta = 1$);
or, for a given $|d\mathbf{r}|$, the change in the scalar function $\varphi$ is maximized by choosing
$d\mathbf{r}$ parallel to $\nabla\varphi$. This identifies $\nabla\varphi$ as a vector having the direction of the
maximum space rate of change of $\varphi$, an identification that will be useful in
Chapter 2 when we consider noncartesian coordinate systems.

This identification of $\nabla\varphi$ may also be developed by using the calculus of
variations subject to a constraint, Exercise 17.6.9.


EXAMPLE 1.6.2 


As a specific example of the foregoing, and as an extension of Example 1.6.1,
we consider the surfaces consisting of concentric spherical shells, Fig. 1.17b.
We have

$$\varphi(x,y,z) = (x^2 + y^2 + z^2)^{1/2} = r_i = C_i,$$

where $r_i$ is the radius equal to $C_i$, our constant. $\Delta C = \Delta\varphi = \Delta r_i$, the distance
between two shells. From Example 1.6.1

$$\nabla\varphi(r) = \mathbf{r}_0\frac{d\varphi(r)}{dr} = \mathbf{r}_0.$$

The gradient is in the radial direction and is normal to the spherical surface
$\varphi = C$.

The gradient of a scalar is of extreme importance in physics in expressing
the relation between a force field and a potential field.

$$\text{force} = -\nabla(\text{potential}). \tag{1.59}$$

This is illustrated by both gravitational and electrostatic fields, among others.
Readers should note that the minus sign in Eq. 1.59 results in water flowing
downhill rather than uphill! We reconsider Eq. 1.59 in a broader context in
Section 1.13.

EXERCISES 

1.6.1 If $S(x,y,z) = (x^2 + y^2 + z^2)^{-3/2}$, find

(a) $\nabla S$ at the point (1, 2, 3);

(b) the magnitude of the gradient of $S$, $|\nabla S|$ at (1, 2, 3);
and

(c) the direction cosines of $\nabla S$ at (1, 2, 3).

1.6.2 (a) Find a unit vector perpendicular to the surface

$$x^2 + y^2 + z^2 = 3$$

at the point (1, 1, 1).

(b) Derive the equation of the plane tangent to the surface at (1, 1, 1).

ANS. (a) $(\mathbf{i} + \mathbf{j} + \mathbf{k})/\sqrt{3}$.
(b) $x + y + z = 3$.

1.6.3 Given a vector $\mathbf{r}_{12} = \mathbf{i}(x_1 - x_2) + \mathbf{j}(y_1 - y_2) + \mathbf{k}(z_1 - z_2)$, show that $\nabla_1 r_{12}$ (gradient
with respect to $x_1$, $y_1$, and $z_1$ of the magnitude $r_{12}$) is a unit vector in the
direction of $\mathbf{r}_{12}$.

1.6.4 If a vector function $\mathbf{F}$ depends on both space coordinates $(x, y, z)$ and time $t$, show
that

$$d\mathbf{F} = (d\mathbf{r}\cdot\nabla)\mathbf{F} + \frac{\partial\mathbf{F}}{\partial t}\,dt.$$

1.6.5 Show that $\nabla(uv) = v\nabla u + u\nabla v$, where $u$ and $v$ are differentiable scalar functions
of $x$, $y$, and $z$.

1.6.6 (a) Show that a necessary and sufficient condition that $u(x,y,z)$ and $v(x,y,z)$
are related by some function $f(u,v) = 0$ is that $(\nabla u)\times(\nabla v) = 0$.

(b) If $u = u(x,y)$ and $v = v(x,y)$, show that the condition $(\nabla u)\times(\nabla v) = 0$ leads
to the two-dimensional Jacobian

$$J\left(\frac{u,v}{x,y}\right) = \begin{vmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{vmatrix} = 0.$$

The functions $u$ and $v$ are assumed differentiable.


1.7 DIVERGENCE, $\nabla\cdot$

Differentiating a vector function is a simple extension of differentiating
scalar quantities. Suppose $\mathbf{r}(t)$ describes the position of a satellite at some time
$t$. Then, for differentiation with respect to time,

$$\frac{d\mathbf{r}(t)}{dt} = \lim_{\Delta t \to 0}\frac{\mathbf{r}(t + \Delta t) - \mathbf{r}(t)}{\Delta t} = \mathbf{v}, \quad \text{linear velocity.}$$

Graphically, we again have the slope of a curve, orbit, or trajectory, as shown
in Fig. 1.18.

If we resolve $\mathbf{r}(t)$ into its cartesian components, $d\mathbf{r}/dt$ always reduces directly
to a vector sum of not more than three (for three-dimensional space) scalar
derivatives. In other coordinate systems (Chapter 2) the situation is a little
more complicated, for the unit vectors are no longer constant in direction.
Differentiation with respect to the space coordinates is handled in the same
way as differentiation with respect to time, as seen in the following paragraphs.

In Section 1.6 $\nabla$ was defined as a vector operator. Now, paying careful
attention to both its vector and its differential properties, we let it operate on
a vector. First, as a vector we dot it into a second vector to obtain

$$\nabla\cdot\mathbf{V} = \frac{\partial V_x}{\partial x} + \frac{\partial V_y}{\partial y} + \frac{\partial V_z}{\partial z}, \tag{1.60}$$

known as the divergence of $\mathbf{V}$. This is a scalar, as discussed in Section 1.3.


EXAMPLE 1.7.1 


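Calculate $\nabla\cdot\mathbf{r}$.

$$\nabla\cdot\mathbf{r} = \left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right)\cdot(\mathbf{i}x + \mathbf{j}y + \mathbf{k}z) = \frac{\partial x}{\partial x} + \frac{\partial y}{\partial y} + \frac{\partial z}{\partial z},$$

or $\nabla\cdot\mathbf{r} = 3$.


EXAMPLE 1.7.2


Generalizing Example 1.7.1,

$$\nabla\cdot\mathbf{r}f(r) = \frac{\partial}{\partial x}[x f(r)] + \frac{\partial}{\partial y}[y f(r)] + \frac{\partial}{\partial z}[z f(r)]$$

$$= 3f(r) + \frac{x^2}{r}\frac{df}{dr} + \frac{y^2}{r}\frac{df}{dr} + \frac{z^2}{r}\frac{df}{dr}$$

$$= 3f(r) + r\frac{df}{dr}.$$

The manipulation of the partial derivatives leading to the second equation in
Example 1.7.2 is discussed in Example 1.6.1.

In particular, if $f(r) = r^{n-1}$,

$$\nabla\cdot\mathbf{r}\,r^{n-1} = \nabla\cdot\mathbf{r}_0 r^n = 3r^{n-1} + (n-1)r^{n-1} = (n+2)r^{n-1}. \tag{1.60a}$$

This divergence vanishes for $n = -2$, an important fact in Section 1.14.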
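
Eq. 1.60a lends itself to a quick numerical check. The sketch below is an illustration only: the helper names `divergence` and `radial_field`, the test point, and the step size are assumptions, and the derivatives are estimated by central differences rather than taken analytically.

```python
import math


def divergence(F, x, y, z, h=1e-5):
    """Central-difference estimate of dFx/dx + dFy/dy + dFz/dz."""
    dFx = (F(x + h, y, z)[0] - F(x - h, y, z)[0]) / (2 * h)
    dFy = (F(x, y + h, z)[1] - F(x, y - h, z)[1]) / (2 * h)
    dFz = (F(x, y, z + h)[2] - F(x, y, z - h)[2]) / (2 * h)
    return dFx + dFy + dFz


def radial_field(n):
    """Return the field r * r**(n - 1) as a function of (x, y, z)."""
    def F(x, y, z):
        scale = math.sqrt(x * x + y * y + z * z) ** (n - 1)
        return (x * scale, y * scale, z * scale)
    return F


point = (0.5, -1.0, 2.0)  # arbitrary point away from the origin
r = math.sqrt(sum(c * c for c in point))
errors = {}
for n in (-2, 0, 1, 3):
    numeric = divergence(radial_field(n), *point)
    analytic = (n + 2) * r ** (n - 1)  # right-hand side of Eq. 1.60a
    errors[n] = abs(numeric - analytic)
    print(n, numeric, analytic)
```

For $n = -2$ the numerical divergence is zero to within the finite-difference error, in agreement with the remark above; the check says nothing about the origin itself, where the field is singular.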


A PHYSICAL INTERPRETATION 

To develop a feeling for the physical significance of the divergence, consider
$\nabla\cdot(\rho\mathbf{v})$ with $\mathbf{v}(x,y,z)$, the velocity of a compressible fluid and $\rho(x,y,z)$, its
density at point $(x,y,z)$. If we consider a small volume $dx\,dy\,dz$ (Fig. 1.19), the
fluid flowing into this volume per unit time (positive $x$-direction) through the
face EFGH is (rate of flow in)$_{EFGH}$ = $\rho v_x|_{x=0}\,dy\,dz$. The components of the flow
$\rho v_y$ and $\rho v_z$ tangential to this face contribute nothing to the flow through this
face. The rate of flow out (still positive $x$-direction) through face ABCD is
$\rho v_x|_{x=dx}\,dy\,dz$. To compare these flows and to find the net flow out, we expand
this last result in a Maclaurin series$^1$, Section 5.6. This yields

$$(\text{rate of flow out})_{ABCD} = \rho v_x|_{x=dx}\,dy\,dz = \left[\rho v_x + \frac{\partial}{\partial x}(\rho v_x)\,dx\right]_{x=0} dy\,dz.$$

Here the derivative term is a first correction term allowing for the possibility
of nonuniform density or velocity or both$^2$. The zero-order term $\rho v_x|_{x=0}$
(corresponding to uniform flow) cancels out.

$^1$ A Maclaurin expansion for a single variable is given by Eq. 5.88, Section 5.6.
Here we have the increment $x$ of Eq. 5.88 replaced by $dx$. We show a partial
derivative with respect to $x$ since $\rho v_x$ may also depend on $y$ and $z$.

$^2$ Strictly speaking, $\rho v_x$ is averaged over face EFGH and the expression
$\rho v_x + (\partial/\partial x)(\rho v_x)\,dx$ is similarly averaged over face ABCD. Using an arbitrarily
small differential volume, we find that the averages reduce to the values
employed here.

FIG. 1.19 Differential rectangular parallelepiped (in first or positive octant)

$$\text{Net rate of flow out}\,|_x = \frac{\partial}{\partial x}(\rho v_x)\,dx\,dy\,dz.$$

Equivalently, we can arrive at this result by

$$\lim_{\Delta x \to 0}\frac{\rho v_x(\Delta x, 0, 0) - \rho v_x(0, 0, 0)}{\Delta x} \equiv \left.\frac{\partial[\rho v_x(x,y,z)]}{\partial x}\right|_{0,0,0}.$$

Now the $x$-axis is not entitled to any preferred treatment. The preceding result
for the two faces perpendicular to the $x$-axis must hold for the two faces
perpendicular to the $y$-axis, with $x$ replaced by $y$ and the corresponding changes
for $y$ and $z$: $y \to z$, $z \to x$. This is a cyclic permutation of the coordinates. A
further cyclic permutation yields the result for the remaining two faces of our
parallelepiped. Adding the net rate of flow out for all three pairs of surfaces of
our volume element, we have

$$\text{net flow out (per unit time)} = \left[\frac{\partial}{\partial x}(\rho v_x) + \frac{\partial}{\partial y}(\rho v_y) + \frac{\partial}{\partial z}(\rho v_z)\right]dx\,dy\,dz = \nabla\cdot(\rho\mathbf{v})\,dx\,dy\,dz. \tag{1.61}$$

Therefore the net flow of our compressible fluid out of the volume element
$dx\,dy\,dz$ per unit volume per unit time is $\nabla\cdot(\rho\mathbf{v})$. Hence the name divergence. A
direct application is in the continuity equation

$$\frac{\partial\rho}{\partial t} + \nabla\cdot(\rho\mathbf{v}) = 0, \tag{1.62}$$

which simply states that a net flow out of the volume results in a decreased
density inside the volume. Note that in Eq. 1.62 $\rho$ is considered to be a possible
function of time as well as of space: $\rho(x,y,z,t)$. The divergence appears in a
wide variety of physical problems, ranging from a probability current density
in quantum mechanics to neutron leakage in a nuclear reactor.

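The combination $\nabla\cdot(f\mathbf{V})$, in which $f$ is a scalar function and $\mathbf{V}$ a vector
function, may be written

$$\nabla\cdot(f\mathbf{V}) = \frac{\partial}{\partial x}(fV_x) + \frac{\partial}{\partial y}(fV_y) + \frac{\partial}{\partial z}(fV_z)$$

$$= \frac{\partial f}{\partial x}V_x + f\frac{\partial V_x}{\partial x} + \frac{\partial f}{\partial y}V_y + f\frac{\partial V_y}{\partial y} + \frac{\partial f}{\partial z}V_z + f\frac{\partial V_z}{\partial z} \tag{1.62a}$$

$$= (\nabla f)\cdot\mathbf{V} + f\,\nabla\cdot\mathbf{V},$$

which is just what we would expect for the derivative of a product. Notice that
$\nabla$ as a differential operator differentiates both $f$ and $\mathbf{V}$; as a vector it is dotted
into $\mathbf{V}$ (in each term).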
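
The product rule of Eq. 1.62a can be spot-checked numerically. This is a minimal sketch under stated assumptions: the scalar `f`, the vector field `V`, the test point `p`, and the helper `partial` are illustrative choices that do not appear in the text, and derivatives are estimated by central differences.

```python
import math


def partial(g, p, i, h=1e-5):
    """Central-difference partial derivative of scalar g along coordinate i at p."""
    hi, lo = list(p), list(p)
    hi[i] += h
    lo[i] -= h
    return (g(*hi) - g(*lo)) / (2 * h)


def f(x, y, z):
    return x * y + z * z            # illustrative scalar function


def V(x, y, z):
    return (y * z, x * x, math.sin(x * z))  # illustrative vector function


def component(F, i):
    """Scalar function returning the i-th component of the vector field F."""
    return lambda x, y, z: F(x, y, z)[i]


def fV(x, y, z):
    return tuple(f(x, y, z) * v for v in V(x, y, z))


p = (0.7, -0.4, 1.2)

# Left side: divergence of the product f V.
lhs = sum(partial(component(fV, i), p, i) for i in range(3))

# Right side: (grad f) . V + f (div V).
grad_f = [partial(f, p, i) for i in range(3)]
div_V = sum(partial(component(V, i), p, i) for i in range(3))
rhs = sum(g * v for g, v in zip(grad_f, V(*p))) + f(*p) * div_V

print(lhs, rhs)
```

The two numbers agree to within the finite-difference error, as the derivation above requires for any differentiable $f$ and $\mathbf{V}$.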

If we have the special case of the divergence of a vector vanishing,

$$\nabla\cdot\mathbf{B} = 0, \tag{1.63}$$

the vector $\mathbf{B}$ is said to be solenoidal, the term coming from the example in
which $\mathbf{B}$ is the magnetic induction and Eq. 1.63 appears as one of Maxwell's
equations. When a vector is solenoidal it may be written as the curl of another
vector known as the vector potential. In Section 1.13 we shall calculate such a
vector potential.


EXERCISES 

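1.7.1 For a particle moving in a circular orbit $\mathbf{r} = \mathbf{i}\,r\cos\omega t + \mathbf{j}\,r\sin\omega t$,

(a) evaluate $\mathbf{r}\times\dot{\mathbf{r}}$.

(b) Show that $\ddot{\mathbf{r}} + \omega^2\mathbf{r} = 0$.

The radius $r$ and the angular velocity $\omega$ are constant.

ANS. (a) $\mathbf{k}\,\omega r^2$.

Note. $\dot{\mathbf{r}} = d\mathbf{r}/dt$, $\ddot{\mathbf{r}} = d^2\mathbf{r}/dt^2$.

1.7.2 Vector $\mathbf{A}$ satisfies the vector transformation law, Eq. 1.15. Show directly that its
time derivative $d\mathbf{A}/dt$ also satisfies Eq. 1.15 and is therefore a vector.

1.7.3 Show, by differentiating components, that

(a) $\dfrac{d}{dt}(\mathbf{A}\cdot\mathbf{B}) = \dfrac{d\mathbf{A}}{dt}\cdot\mathbf{B} + \mathbf{A}\cdot\dfrac{d\mathbf{B}}{dt}$,

(b) $\dfrac{d}{dt}(\mathbf{A}\times\mathbf{B}) = \dfrac{d\mathbf{A}}{dt}\times\mathbf{B} + \mathbf{A}\times\dfrac{d\mathbf{B}}{dt}$,

just like the derivative of the product of two algebraic functions.

1.7.4 In Chapter 2 it will be seen that the unit vectors in noncartesian coordinate systems
are usually functions of the coordinate variables, $\mathbf{e}_i = \mathbf{e}_i(q_1, q_2, q_3)$, but $|\mathbf{e}_i| = 1$.
Show that either $\partial\mathbf{e}_i/\partial q_j = 0$ or $\partial\mathbf{e}_i/\partial q_j$ is orthogonal to $\mathbf{e}_i$.

1.7.5 Prove $\nabla\cdot(\mathbf{a}\times\mathbf{b}) = \mathbf{b}\cdot\nabla\times\mathbf{a} - \mathbf{a}\cdot\nabla\times\mathbf{b}$.

Hint. Treat as a triple scalar product.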





1.7.6 The electrostatic field of a point charge $q$ is

$$\mathbf{E} = \frac{q}{4\pi\varepsilon_0}\,\frac{\mathbf{r}_0}{r^2}.$$

Calculate the divergence of $\mathbf{E}$. What happens at the origin?


1.8 CURL, $\nabla\times$

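Another possible operation with the vector operator $\nabla$ is to cross it into a
vector. We obtain

$$\nabla\times\mathbf{V} = \mathbf{i}\left(\frac{\partial V_z}{\partial y} - \frac{\partial V_y}{\partial z}\right) + \mathbf{j}\left(\frac{\partial V_x}{\partial z} - \frac{\partial V_z}{\partial x}\right) + \mathbf{k}\left(\frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y}\right) = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[1mm] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[2mm] V_x & V_y & V_z \end{vmatrix}, \tag{1.64}$$

which is called the curl of $\mathbf{V}$. In expanding this determinant form or in any
operation with $\nabla$, we must consider the derivative nature of $\nabla$. Specifically,
$\mathbf{V}\times\nabla$ is defined only as an operator, another vector differential operator. It is
certainly not equal, in general, to $-\nabla\times\mathbf{V}$.$^1$ In the case of Eq. 1.64 the determinant
must be expanded from the top down so that we get the derivatives as
shown in the middle portion of Eq. 1.64. If $\nabla$ is crossed into the product of a
scalar and a vector, we can show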


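$$\nabla\times(f\mathbf{V})|_x = \frac{\partial}{\partial y}(fV_z) - \frac{\partial}{\partial z}(fV_y)$$

$$= f\frac{\partial V_z}{\partial y} + \frac{\partial f}{\partial y}V_z - f\frac{\partial V_y}{\partial z} - \frac{\partial f}{\partial z}V_y \tag{1.65}$$

$$= f\,\nabla\times\mathbf{V}|_x + (\nabla f)\times\mathbf{V}|_x.$$

If we permute the coordinates $x \to y$, $y \to z$, $z \to x$ to pick up the $y$-component
and then permute them a second time to pick up the $z$-component, we obtain

$$\nabla\times(f\mathbf{V}) = f\,\nabla\times\mathbf{V} + (\nabla f)\times\mathbf{V}, \tag{1.66}$$

which is the vector product analog of Eq. 1.62a. Again, as a differential operator
$\nabla$ differentiates both $f$ and $\mathbf{V}$. As a vector it is crossed into $\mathbf{V}$ (in each term).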


EXAMPLE 1.8.1 


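Calculate $\nabla\times\mathbf{r}f(r)$.
By Eq. 1.66

$$\nabla\times\mathbf{r}f(r) = f(r)\,\nabla\times\mathbf{r} + [\nabla f(r)]\times\mathbf{r}. \tag{1.67}$$

First,

$$\nabla\times\mathbf{r} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[1mm] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[2mm] x & y & z \end{vmatrix} = 0. \tag{1.68}$$

Second, using $\nabla f(r) = \mathbf{r}_0(df/dr)$ (Example 1.6.1), we obtain

$$\nabla\times\mathbf{r}f(r) = \frac{df}{dr}\,\mathbf{r}_0\times\mathbf{r} = 0. \tag{1.69}$$

The vector product vanishes, since $\mathbf{r} = \mathbf{r}_0 r$ and $\mathbf{r}_0\times\mathbf{r}_0 = 0$.

$^1$ In this same spirit, if $\mathbf{A}$ is a differential operator, it is not necessarily true
that $\mathbf{A}\times\mathbf{A} = 0$. Specifically, for the quantum mechanical angular momentum
operator, $\mathbf{L} = -i(\mathbf{r}\times\nabla)$, we find that $\mathbf{L}\times\mathbf{L} = i\mathbf{L}$.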
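
A numerical illustration of Example 1.8.1 follows. It is a sketch, not part of the original text: the central field uses the assumed choice $f(r) = e^{-r}/r$, the helper `curl` estimates derivatives by central differences, and a rigid rotation about the $z$-axis is included only as a contrasting field whose curl does not vanish.

```python
import math


def curl(F, p, h=1e-5):
    """Central-difference estimate of the curl of F at point p."""
    def d(i, j):
        # Partial derivative of component i with respect to coordinate j.
        hi, lo = list(p), list(p)
        hi[j] += h
        lo[j] -= h
        return (F(*hi)[i] - F(*lo)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))


def central(x, y, z):
    # r f(r) with the illustrative choice f(r) = exp(-r) / r.
    r = math.sqrt(x * x + y * y + z * z)
    fr = math.exp(-r) / r
    return (x * fr, y * fr, z * fr)


def rotating(x, y, z):
    # Contrast case: v = k x r, a rigid rotation about the z-axis.
    return (-y, x, 0.0)


p = (1.0, -2.0, 0.5)
curl_central = curl(central, p)
curl_rotating = curl(rotating, p)
print("curl of r f(r):", curl_central)
print("curl of k x r :", curl_rotating)
```

The first result is zero to within the finite-difference error, as Eq. 1.69 predicts, while the rotating field has a curl of magnitude 2 along the rotation axis.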

To develop a better feeling for the physical significance of the curl, we
consider the circulation of fluid around a differential loop in the $xy$-plane,
Fig. 1.20.

Although the circulation is technically given by a vector line integral $\int\mathbf{V}\cdot d\boldsymbol{\lambda}$
(Section 1.10), we can set up the equivalent scalar integrals here. Let us take
the circulation to be

$$\text{circulation}_{1234} = \int_1 V_x(x,y)\,d\lambda_x + \int_2 V_y(x,y)\,d\lambda_y + \int_3 V_x(x,y)\,d\lambda_x + \int_4 V_y(x,y)\,d\lambda_y. \tag{1.70}$$

The numbers 1, 2, 3, and 4 refer to the numbered line segments in Fig. 1.20.
In the first integral $d\lambda_x = +dx$ but in the third integral $d\lambda_x = -dx$ because
the third line segment is traversed in the negative $x$-direction. Similarly, $d\lambda_y = +dy$
for the second integral, $-dy$ for the fourth. Next, the integrands are
referred to the point $(x_0, y_0)$ with a Taylor expansion$^2$ taking into account the
displacement of line segment 3 from 1 and 2 from 4. For our differential line
segments this leads to

$$\text{circulation}_{1234} = V_x(x_0,y_0)\,dx + \left[V_y(x_0,y_0) + \frac{\partial V_y}{\partial x}dx\right]dy + \left[V_x(x_0,y_0) + \frac{\partial V_x}{\partial y}dy\right](-dx) + V_y(x_0,y_0)(-dy)$$

$$= \left(\frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y}\right)dx\,dy. \tag{1.71}$$

$^2$ $V_y(x_0 + dx, y_0) = V_y(x_0, y_0) + \left(\dfrac{\partial V_y}{\partial x}\right)_{x_0 y_0} dx + \cdots$.
The higher-order terms will drop out in the limit as $dx \to 0$. A correction term
for the variation of $V_y$ with $y$ is canceled by the corresponding term in the
fourth integral (see Section 5.6).

Dividing by $dx\,dy$, we have

$$\text{circulation per unit area} = \nabla\times\mathbf{V}|_z. \tag{1.72}$$

The circulation$^3$ about our differential area in the $xy$-plane is given by the
$z$-component of $\nabla\times\mathbf{V}$. In principle, the curl, $\nabla\times\mathbf{V}$ at $(x_0, y_0)$, could be
determined by inserting a (differential) paddle wheel into the moving fluid at
point $(x_0, y_0)$. The rotation of the little paddle wheel would be a measure of the
curl.
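
The limit in Eq. 1.72 can be watched numerically. The following is a rough sketch under stated assumptions: the field $\mathbf{V} = -\mathbf{i}y^2 + \mathbf{j}xy$ is an illustrative choice (not from the text) for which $\partial V_y/\partial x - \partial V_x/\partial y = 3y$, and the four line integrals of Eq. 1.70 are approximated by the midpoint rule along a rectangle with one corner at $(x_0, y_0)$.

```python
def V(x, y):
    # Illustrative field (not from the text): V = -i y^2 + j x y,
    # for which dVy/dx - dVx/dy = y + 2y = 3y.
    return (-y * y, x * y)


def circulation(x0, y0, dx, dy, n=200):
    """Midpoint-rule circulation around the rectangle with corner (x0, y0),
    traversed counterclockwise as line segments 1, 2, 3, 4."""
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        total += V(x0 + t * dx, y0)[0] * dx / n        # segment 1, +x direction
        total += V(x0 + dx, y0 + t * dy)[1] * dy / n   # segment 2, +y direction
        total -= V(x0 + t * dx, y0 + dy)[0] * dx / n   # segment 3, -x direction
        total -= V(x0, y0 + t * dy)[1] * dy / n        # segment 4, -y direction
    return total


x0, y0 = 1.0, 2.0
ratios = []
for size in (1e-1, 1e-2, 1e-3):
    ratios.append(circulation(x0, y0, size, size) / (size * size))
print(ratios)  # circulation per unit area for shrinking loops
```

As the loop shrinks, the circulation per unit area approaches the $z$-component of the curl at $(x_0, y_0)$, here $3y_0$, with the remaining difference coming from the higher-order Taylor terms that drop out in the limit.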

We shall use the result, Eq. 1.71, in Section 1.13 to derive Stokes's theorem.
Whenever the curl of a vector $\mathbf{V}$ vanishes,

$$\nabla\times\mathbf{V} = 0, \tag{1.73}$$

$\mathbf{V}$ is labeled irrotational. The most important physical examples of irrotational
vectors are the gravitational and electrostatic forces. In each case

$$\mathbf{V} = C\,\frac{\mathbf{r}_0}{r^2}, \tag{1.74}$$

where $C$ is a constant and $\mathbf{r}_0$ is the unit vector in the outward radial direction.
For the gravitational case we have $C = -Gm_1m_2$, given by Newton's law of
universal gravitation. If $C = q_1q_2/4\pi\varepsilon_0$, we have Coulomb's law of electrostatics
(mks units). The force $\mathbf{V}$ given in Eq. 1.74 may be shown to be irrotational
by direct expansion into cartesian components as we did in Example 1.8.1.
Another approach is developed in Chapter 2, in which we express $\nabla\times$, the
curl, in terms of spherical polar coordinates. In Section 1.13 we shall see that
whenever a vector is irrotational, the vector may be written as the (negative)
gradient of a scalar potential. In Section 1.15 we shall prove that a vector may
be resolved into an irrotational part and a solenoidal part (subject to conditions
at infinity). In terms of the electromagnetic field this corresponds to the resolution
into an irrotational electric field and a solenoidal magnetic field.

For waves in an elastic medium, if the displacement $\mathbf{u}$ is irrotational,
$\nabla\times\mathbf{u} = 0$, plane waves (or spherical waves at large distances) become longitudinal.
If $\mathbf{u}$ is solenoidal, $\nabla\cdot\mathbf{u} = 0$, then the waves become transverse. A seismic
disturbance will produce a displacement that may be resolved into a solenoidal
part and an irrotational part (compare Section 1.15). The irrotational part
yields the longitudinal P (primary) earthquake waves. The solenoidal part
gives rise to the slower transverse S (secondary) waves, Exercise 3.6.8.

$^3$ In fluid dynamics $\nabla\times\mathbf{V}$ is called the "vorticity."

Using the gradient, divergence, and curl, and of course the BAC-CAB rule,
we may construct or verify a large number of useful vector identities. For
verification, complete expansion into cartesian components is always a possibility.
Sometimes if we use insight instead of routine shuffling of cartesian components,
the verification process can be shortened drastically.

Remember that $\nabla$ is a vector operator, a hybrid creature satisfying two sets
of rules:

1. vector rules, and

2. partial differentiation rules, including differentiation of a product.

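EXAMPLE 1.8.2 Gradient of a Dot Product

Verify that

$$\nabla(\mathbf{A}\cdot\mathbf{B}) = (\mathbf{B}\cdot\nabla)\mathbf{A} + (\mathbf{A}\cdot\nabla)\mathbf{B} + \mathbf{B}\times(\nabla\times\mathbf{A}) + \mathbf{A}\times(\nabla\times\mathbf{B}). \tag{1.75}$$

This particular example hinges on the recognition that $\nabla(\mathbf{A}\cdot\mathbf{B})$ is the type of
term that appears in the BAC-CAB expansion of a triple vector product, Eq.
1.50. For instance,

$$\mathbf{A}\times(\nabla\times\mathbf{B}) = \nabla(\mathbf{A}\cdot\mathbf{B}) - (\mathbf{A}\cdot\nabla)\mathbf{B},$$

with the $\nabla$ differentiating only $\mathbf{B}$, not $\mathbf{A}$. From the commutativity of factors in
a scalar product we may interchange $\mathbf{A}$ and $\mathbf{B}$ and write

$$\mathbf{B}\times(\nabla\times\mathbf{A}) = \nabla(\mathbf{A}\cdot\mathbf{B}) - (\mathbf{B}\cdot\nabla)\mathbf{A},$$

now with $\nabla$ differentiating only $\mathbf{A}$, not $\mathbf{B}$. Adding these two equations, we
obtain $\nabla$ differentiating the product $\mathbf{A}\cdot\mathbf{B}$ and the identity, Eq. 1.75.

This identity is used frequently in advanced electromagnetic theory. Exercise
1.8.15 is one simple illustration.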
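
Identities such as Eq. 1.75 can also be spot-checked numerically at a point. The sketch below is illustrative only: the fields `A` and `B`, the test point, and the helper names are assumptions, and every derivative is a central-difference estimate, so agreement is expected only to within the discretization error.

```python
import math

H = 1e-5  # finite-difference step (an arbitrary choice)


def jacobian(F, p):
    """J[i][j] = dF_i/dx_j at point p, by central differences."""
    J = [[0.0] * 3 for _ in range(3)]
    for j in range(3):
        hi, lo = list(p), list(p)
        hi[j] += H
        lo[j] -= H
        Fhi, Flo = F(*hi), F(*lo)
        for i in range(3):
            J[i][j] = (Fhi[i] - Flo[i]) / (2 * H)
    return J


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def curl_from(J):
    return (J[2][1] - J[1][2], J[0][2] - J[2][0], J[1][0] - J[0][1])


def directional(u, J):
    """(u . del) F: component i is sum_j u_j dF_i/dx_j."""
    return tuple(sum(u[j] * J[i][j] for j in range(3)) for i in range(3))


def A(x, y, z):
    return (x * y, math.sin(z), x + z * z)       # illustrative field


def B(x, y, z):
    return (y * z, x * x, math.cos(x * y))       # illustrative field


def A_dot_B(x, y, z):
    return sum(a * b for a, b in zip(A(x, y, z), B(x, y, z)))


p = (0.3, 1.1, -0.8)
a, b = A(*p), B(*p)
JA, JB = jacobian(A, p), jacobian(B, p)

# Left side of Eq. 1.75: gradient of the scalar A . B.
lhs = []
for j in range(3):
    hi, lo = list(p), list(p)
    hi[j] += H
    lo[j] -= H
    lhs.append((A_dot_B(*hi) - A_dot_B(*lo)) / (2 * H))

# Right side of Eq. 1.75: the four terms, summed component by component.
terms = [directional(b, JA), directional(a, JB),
         cross(b, curl_from(JA)), cross(a, curl_from(JB))]
rhs = [sum(t[i] for t in terms) for i in range(3)]

max_error = max(abs(l - r) for l, r in zip(lhs, rhs))
print(lhs, rhs, max_error)
```

A check of this kind is no substitute for the proof above, but it catches sign and ordering slips quickly.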


EXERCISES 

1.8.1 Show, by rotating the coordinates, that the components of the curl of a vector
transform as a vector.

Hint. The direction cosine identities of Eq. 1.41 are available as needed.

1.8.2 Show that $\mathbf{u}\times\mathbf{v}$ is solenoidal if $\mathbf{u}$ and $\mathbf{v}$ are each irrotational.

1.8.3 If $\mathbf{A}$ is irrotational, show that $\mathbf{A}\times\mathbf{r}$ is solenoidal.

1.8.4 A rigid body is rotating with constant angular velocity $\boldsymbol{\omega}$. Show that the linear
velocity $\mathbf{v}$ is solenoidal.

1.8.5 A vector function $\mathbf{f}(x,y,z)$ is not irrotational but the product of $\mathbf{f}$ and a scalar
function $g(x,y,z)$ is irrotational. Show that

$$\mathbf{f}\cdot\nabla\times\mathbf{f} = 0.$$

1.8.6 If (a) $\mathbf{V} = \mathbf{i}V_x(x,y) + \mathbf{j}V_y(x,y)$ and (b) $\nabla\times\mathbf{V} \neq 0$, prove that $\nabla\times\mathbf{V}$ is
perpendicular to $\mathbf{V}$.

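1.8.7 Classically, angular momentum is given by $\mathbf{L} = \mathbf{r}\times\mathbf{p}$, where $\mathbf{p}$ is the linear
momentum. To go from classical mechanics to quantum mechanics, replace $\mathbf{p}$
by the operator $-i\nabla$ (Section 15.6). Show that the quantum mechanical angular
momentum operator has cartesian components

$$L_x = -i\left(y\frac{\partial}{\partial z} - z\frac{\partial}{\partial y}\right), \quad L_y = -i\left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right), \quad L_z = -i\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right)$$

(in units of $\hbar$).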

1.8.8 Using the angular momentum operators previously given, show that they satisfy
commutation relations of the form

$$[L_x, L_y] \equiv L_xL_y - L_yL_x = iL_z$$

and hence

$$\mathbf{L}\times\mathbf{L} = i\mathbf{L}.$$

These commutation relations will be taken later as the defining relations of an
angular momentum operator (Exercise 4.2.15 and the following one, and Section
12.7).

1.8.9 With the commutator bracket notation $[L_x, L_y] = L_xL_y - L_yL_x$, the angular
momentum vector $\mathbf{L}$ satisfies $[L_x, L_y] = iL_z$, etc., or $\mathbf{L}\times\mathbf{L} = i\mathbf{L}$.
Two other vectors $\mathbf{a}$ and $\mathbf{b}$ commute with each other and with $\mathbf{L}$, that is, $[\mathbf{a}, \mathbf{b}] =
[\mathbf{a}, \mathbf{L}] = [\mathbf{b}, \mathbf{L}] = 0$. Show that

$$[\mathbf{a}\cdot\mathbf{L}, \mathbf{b}\cdot\mathbf{L}] = i(\mathbf{a}\times\mathbf{b})\cdot\mathbf{L}.$$

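1.8.10 For $\mathbf{A} = \mathbf{i}A_x(x,y,z)$ and $\mathbf{B} = \mathbf{i}B_x(x,y,z)$ evaluate each term in the vector identity

$$\nabla(\mathbf{A}\cdot\mathbf{B}) = (\mathbf{B}\cdot\nabla)\mathbf{A} + (\mathbf{A}\cdot\nabla)\mathbf{B} + \mathbf{B}\times(\nabla\times\mathbf{A}) + \mathbf{A}\times(\nabla\times\mathbf{B})$$

and verify that the identity is satisfied.

1.8.11 Verify the vector identity

$$\nabla\times(\mathbf{A}\times\mathbf{B}) = (\mathbf{B}\cdot\nabla)\mathbf{A} - (\mathbf{A}\cdot\nabla)\mathbf{B} - \mathbf{B}(\nabla\cdot\mathbf{A}) + \mathbf{A}(\nabla\cdot\mathbf{B}).$$

1.8.12 As an alternative to the vector identity of Example 1.8.2 show that

$$\nabla(\mathbf{A}\cdot\mathbf{B}) = (\mathbf{A}\times\nabla)\times\mathbf{B} + (\mathbf{B}\times\nabla)\times\mathbf{A} + \mathbf{A}(\nabla\cdot\mathbf{B}) + \mathbf{B}(\nabla\cdot\mathbf{A}).$$

1.8.13 Verify the identity

$$\mathbf{A}\times(\nabla\times\mathbf{A}) = \tfrac{1}{2}\nabla(A^2) - (\mathbf{A}\cdot\nabla)\mathbf{A}.$$

1.8.14 If $\mathbf{A}$ and $\mathbf{B}$ are constant vectors, show that

$$\nabla(\mathbf{A}\cdot\mathbf{B}\times\mathbf{r}) = \mathbf{A}\times\mathbf{B}.$$

1.8.15 A distribution of electric currents creates a constant magnetic moment $\mathbf{m}$. The
force on $\mathbf{m}$ in an external magnetic induction $\mathbf{B}$ is given by

$$\mathbf{F} = \nabla\times(\mathbf{B}\times\mathbf{m}).$$

Show that

$$\mathbf{F} = \nabla(\mathbf{m}\cdot\mathbf{B}).$$

Note. Assuming no time dependence of the fields, Maxwell's equations yield
$\nabla\times\mathbf{B} = 0$. Also $\nabla\cdot\mathbf{B} = 0$.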


1.8.16 An electric dipole of moment $\mathbf{p}$ is located at the origin. The dipole creates an
electric potential at $\mathbf{r}$ given by

$$\psi(\mathbf{r}) = \frac{\mathbf{p}\cdot\mathbf{r}}{4\pi\varepsilon_0 r^3}.$$

Find the electric field, $\mathbf{E} = -\nabla\psi$ at $\mathbf{r}$.


1.8.17 The vector potential $\mathbf{A}$ of a magnetic dipole, dipole moment $\mathbf{m}$, is given by
$\mathbf{A}(\mathbf{r}) = (\mu_0/4\pi)(\mathbf{m}\times\mathbf{r}/r^3)$. Show that the magnetic induction $\mathbf{B} = \nabla\times\mathbf{A}$ is given
by

$$\mathbf{B} = \frac{\mu_0}{4\pi}\,\frac{3\mathbf{r}_0(\mathbf{r}_0\cdot\mathbf{m}) - \mathbf{m}}{r^3}.$$

Note. The limiting process leading to point dipoles is discussed in Section 12.1
for electric dipoles, Section 12.5 for magnetic dipoles.

1.8.18 The velocity of a two-dimensional flow of liquid is given by

$$\mathbf{V} = \mathbf{i}\,u(x,y) - \mathbf{j}\,v(x,y).$$

If the liquid is incompressible and the flow is irrotational show that

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \quad\text{and}\quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}.$$

These are the Cauchy-Riemann conditions of Section 6.2.

1.8.19 The evaluation in this section of the four integrals for the circulation omitted
Taylor series terms such as $\partial V_x/\partial x$, $\partial V_y/\partial y$ and all second derivatives. Show that
$\partial V_x/\partial x$, $\partial V_y/\partial y$ cancel out when the four integrals are added and that the second
derivative terms drop out in the limit as $dx \to 0$, $dy \to 0$.

Hint. Calculate the circulation per unit area and then take the limit $dx \to 0$,
$dy \to 0$.


1.9 SUCCESSIVE APPLICATIONS OF $\nabla$

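We have now defined gradient, divergence, and curl to obtain vector,
scalar, and vector quantities, respectively. Letting $\nabla$ operate on each of these
quantities, we obtain

(a) $\nabla\cdot\nabla\varphi$ (b) $\nabla\times\nabla\varphi$ (c) $\nabla\nabla\cdot\mathbf{V}$

(d) $\nabla\cdot\nabla\times\mathbf{V}$ (e) $\nabla\times(\nabla\times\mathbf{V})$,

all five expressions involving second derivatives and all five appearing in the
second-order differential equations of mathematical physics, particularly in
electromagnetic theory.

The first expression, $\nabla\cdot\nabla\varphi$, the divergence of the gradient, is named the
Laplacian of $\varphi$. We have

$$\nabla\cdot\nabla\varphi = \left(\mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z}\right)\cdot\left(\mathbf{i}\frac{\partial\varphi}{\partial x} + \mathbf{j}\frac{\partial\varphi}{\partial y} + \mathbf{k}\frac{\partial\varphi}{\partial z}\right) = \frac{\partial^2\varphi}{\partial x^2} + \frac{\partial^2\varphi}{\partial y^2} + \frac{\partial^2\varphi}{\partial z^2}. \tag{1.76a}$$

When $\varphi$ is the electrostatic potential, we have

$$\nabla\cdot\nabla\varphi = 0, \tag{1.76b}$$

which is Laplace's equation of electrostatics. Often the combination $\nabla\cdot\nabla$ is
written $\nabla^2$.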


EXAMPLE 1.9.1 


Calculate $\nabla\cdot\nabla g(r)$.

Referring to Examples 1.6.1 and 1.7.2,

$$\nabla\cdot\nabla g(r) = \nabla\cdot\mathbf{r}_0\frac{dg}{dr} = \frac{2}{r}\frac{dg}{dr} + \frac{d^2g}{dr^2},$$

replacing $f(r)$ in Example 1.7.2 by $(1/r)\cdot dg/dr$. If $g(r) = r^n$, this reduces to

$$\nabla\cdot\nabla r^n = n(n+1)r^{n-2}.$$

This vanishes for $n = 0$ [$g(r)$ = constant] and for $n = -1$; that is, $g(r) = 1/r$
is a solution of Laplace's equation, $\nabla^2 g(r) = 0$. This is for $r \neq 0$. At $r = 0$ a
Dirac delta function is involved (see Eq. 1.173 and Section 8.7).
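
The formula $\nabla\cdot\nabla r^n = n(n+1)r^{n-2}$ can be confirmed numerically away from the origin. This is a sketch under stated assumptions: the helper `laplacian` uses second central differences with an arbitrarily chosen step, and the test point is picked so that $r = 3$.

```python
import math


def laplacian(g, p, h=1e-4):
    """Second central-difference estimate of d2g/dx2 + d2g/dy2 + d2g/dz2 at p."""
    g0 = g(*p)
    total = 0.0
    for i in range(3):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        total += (g(*hi) - 2 * g0 + g(*lo)) / (h * h)
    return total


def power(n):
    """Return g(r) = r**n as a function of (x, y, z)."""
    return lambda x, y, z: math.sqrt(x * x + y * y + z * z) ** n


p = (1.0, 2.0, 2.0)                       # r = 3 at this point
r = math.sqrt(sum(c * c for c in p))
results = {}
for n in (-1, 0, 2, 3):
    results[n] = (laplacian(power(n), p), n * (n + 1) * r ** (n - 2))
    print(n, results[n])
```

The cases $n = 0$ and $n = -1$ give zero to within the discretization error, in line with the statement that $1/r$ satisfies Laplace's equation for $r \neq 0$.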

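Expression (b) may be written

$$\nabla\times\nabla\varphi = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[1mm] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[2mm] \dfrac{\partial\varphi}{\partial x} & \dfrac{\partial\varphi}{\partial y} & \dfrac{\partial\varphi}{\partial z} \end{vmatrix}.$$

By expanding the determinant, we obtain

$$\nabla\times\nabla\varphi = \mathbf{i}\left(\frac{\partial^2\varphi}{\partial y\,\partial z} - \frac{\partial^2\varphi}{\partial z\,\partial y}\right) + \mathbf{j}\left(\frac{\partial^2\varphi}{\partial z\,\partial x} - \frac{\partial^2\varphi}{\partial x\,\partial z}\right) + \mathbf{k}\left(\frac{\partial^2\varphi}{\partial x\,\partial y} - \frac{\partial^2\varphi}{\partial y\,\partial x}\right) = 0, \tag{1.77}$$

assuming that the order of partial differentiation may be interchanged. This is
true as long as these second partial derivatives of $\varphi$ are continuous functions.
Then, from Eq. 1.77, the curl of a gradient is identically zero. All gradients,
therefore, are irrotational. Note carefully that the zero in Eq. 1.77 comes as a
mathematical identity, independent of any physics. The zero in Eq. 1.76b is a
consequence of physics.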

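Expression (d) is a triple scalar product which may be written

$$\nabla\cdot\nabla\times\mathbf{V} = \begin{vmatrix} \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[2mm] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[2mm] V_x & V_y & V_z \end{vmatrix}. \tag{1.78}$$

Again, assuming continuity so that the order of differentiation is immaterial,
we obtain

$$\nabla\cdot\nabla\times\mathbf{V} = 0. \tag{1.79}$$

The divergence of a curl vanishes or all curls are solenoidal. In Section 1.15 we
shall see that vectors may be resolved into solenoidal and irrotational parts by
Helmholtz's theorem.

The two remaining expressions satisfy a relation

$$\nabla\times(\nabla\times\mathbf{V}) = \nabla\nabla\cdot\mathbf{V} - \nabla\cdot\nabla\mathbf{V}. \tag{1.80}$$

This follows immediately from Eq. 1.50, the BAC-CAB rule, which we rewrite
so that $\mathbf{C}$ appears at the extreme right of each term. The term $\nabla\cdot\nabla\mathbf{V}$ was not
included in our list, but it may be defined by Eq. 1.80. If $\mathbf{V}$ is expanded in
cartesian coordinates so that the unit vectors are constant in direction as well
as in magnitude, $\nabla\cdot\nabla\mathbf{V}$, a vector Laplacian, reduces to

$$\nabla\cdot\nabla\mathbf{V} = \mathbf{i}\,\nabla\cdot\nabla V_x + \mathbf{j}\,\nabla\cdot\nabla V_y + \mathbf{k}\,\nabla\cdot\nabla V_z,$$

a vector sum of ordinary scalar Laplacians. By expanding in cartesian coordinates,
we may verify Eq. 1.80 as a vector identity.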


EXAMPLE 1.9.2 Electromagnetic Wave Equation 


One important application of this vector relation (Eq. 1.80) is in the derivation
of the electromagnetic wave equation. In vacuum Maxwell's equations
become

$$\nabla\cdot\mathbf{B} = 0, \tag{1.81a}$$

$$\nabla\cdot\mathbf{E} = 0, \tag{1.81b}$$

$$\nabla\times\mathbf{B} = \varepsilon_0\mu_0\frac{\partial\mathbf{E}}{\partial t}, \tag{1.81c}$$

$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}. \tag{1.81d}$$

Here $\mathbf{E}$ is the electric field, $\mathbf{B}$ the magnetic induction, $\varepsilon_0$ the electric permittivity,
and $\mu_0$ the magnetic permeability (mks or SI units). Suppose we eliminate $\mathbf{B}$
from Eqs. 1.81c and 1.81d. We may do this by taking the curl of both sides of
Eq. 1.81d and the time derivative of both sides of Eq. 1.81c. Since the space
and time derivatives commute,

$$\frac{\partial}{\partial t}\nabla\times\mathbf{B} = \nabla\times\frac{\partial\mathbf{B}}{\partial t}, \tag{1.82}$$

and we obtain

$$\nabla\times(\nabla\times\mathbf{E}) = -\varepsilon_0\mu_0\frac{\partial^2\mathbf{E}}{\partial t^2}. \tag{1.83}$$

Application of Eqs. 1.80 and 1.81b yields

$$\nabla\cdot\nabla\mathbf{E} = \varepsilon_0\mu_0\frac{\partial^2\mathbf{E}}{\partial t^2}, \tag{1.84}$$

the electromagnetic vector wave equation. Again, if $\mathbf{E}$ is expressed in cartesian
coordinates, Eq. 1.84 separates into three scalar wave equations, each involving
a scalar Laplacian.
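
As an illustration of one of these scalar wave equations, the sketch below tests a plane wave $E_y = \cos(kx - \omega t)$ against Eq. 1.84 by finite differences. Several things here are assumptions made only for the illustration: units are chosen so that $\varepsilon_0\mu_0 = 1$, the wave depends on $x$ and $t$ only, and the sample point and step size are arbitrary.

```python
import math

EPS0_MU0 = 1.0  # assumption: units chosen so that eps0 * mu0 = 1


def second_diff(g, x, t, which, h=1e-4):
    """Second central difference of g(x, t) in x (which='x') or in t (which='t')."""
    if which == "x":
        return (g(x + h, t) - 2 * g(x, t) + g(x - h, t)) / (h * h)
    return (g(x, t + h) - 2 * g(x, t) + g(x, t - h)) / (h * h)


def plane_wave(k, omega):
    """One cartesian field component, E_y = cos(k x - omega t)."""
    return lambda x, t: math.cos(k * x - omega * t)


def residual(k, omega, x=0.3, t=0.7):
    """Left side minus right side of the scalar form of Eq. 1.84."""
    E = plane_wave(k, omega)
    return second_diff(E, x, t, "x") - EPS0_MU0 * second_diff(E, x, t, "t")


matched = residual(2.0, 2.0)      # omega / k equals 1 / sqrt(eps0 * mu0)
mismatched = residual(2.0, 3.0)   # wrong phase velocity
print(matched, mismatched)
```

The residual vanishes (to within the discretization error) only when $\omega/k = 1/\sqrt{\varepsilon_0\mu_0}$, which is the statement that the wave travels at the speed fixed by Eq. 1.84.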


EXERCISES 

1.9.1 Verify Eq. 1.80

$$\nabla\times(\nabla\times\mathbf{V}) = \nabla\nabla\cdot\mathbf{V} - \nabla\cdot\nabla\mathbf{V}$$

by direct expansion in cartesian coordinates.

1.9.2 Show that the identity

$$\nabla\times(\nabla\times\mathbf{V}) = \nabla\nabla\cdot\mathbf{V} - \nabla\cdot\nabla\mathbf{V}$$

follows from the BAC-CAB rule for a triple vector product. Justify any alteration
of the order of factors in the BAC and CAB terms.

1.9.3 Prove that $\nabla\times(\varphi\nabla\varphi) = 0$.

1.9.4 You are given that the curl of $\mathbf{F}$ equals the curl of $\mathbf{G}$. Show that $\mathbf{F}$ and $\mathbf{G}$ may
differ by (a) a constant and (b) a gradient of a scalar function.

1.9.5 The Navier-Stokes equation of hydrodynamics contains a nonlinear term
$(\mathbf{v}\cdot\nabla)\mathbf{v}$. Show that the curl of this term may be written $-\nabla\times[\mathbf{v}\times(\nabla\times\mathbf{v})]$.

1.9.6 From the Navier-Stokes equation for the steady flow of an incompressible
viscous fluid we have the term

$$\nabla\times[\mathbf{v}\times(\nabla\times\mathbf{v})]$$

where $\mathbf{v}$ is the fluid velocity. Show that this term vanishes for the special case

$$\mathbf{v} = \mathbf{i}\,v(y,z).$$

1.9.7 Prove that $(\nabla u)\times(\nabla v)$ is solenoidal where $u$ and $v$ are differentiable scalar
functions.

1.9.8 $\varphi$ is a scalar satisfying Laplace's equation, $\nabla^2\varphi = 0$. Show that $\nabla\varphi$ is both solenoidal
and irrotational.

1.9.9 With $\psi$ a scalar function, show that

$$(\mathbf{r}\times\nabla)\cdot(\mathbf{r}\times\nabla)\psi = r^2\nabla^2\psi - r^2\frac{\partial^2\psi}{\partial r^2} - 2r\frac{\partial\psi}{\partial r}.$$

(This can actually be shown more easily in spherical polar coordinates, Section
2.5.)

1.9.10 In a (nonrotating) isolated mass such as a star, the condition for equilibrium is

$$\nabla P + \rho\nabla\varphi = 0.$$

Here $P$ is the total pressure, $\rho$ the density, and $\varphi$ the gravitational potential.
Show that at any given point the normals to the surfaces of constant pressure
and constant gravitational potential are parallel.

1.9.11 In the Pauli theory of the electron one encounters the expression

$$(\mathbf{p} - e\mathbf{A})\times(\mathbf{p} - e\mathbf{A})\psi,$$

where $\psi$ is a scalar function. $\mathbf{A}$ is the magnetic vector potential related to the
magnetic induction $\mathbf{B}$ by $\mathbf{B} = \nabla\times\mathbf{A}$. Given that $\mathbf{p} = -i\nabla$, show that this expression
reduces to $ie\mathbf{B}\psi$.

1.9.12 Show that any solution of the equation

$$\nabla\times\nabla\times\mathbf{A} - k^2\mathbf{A} = 0$$

automatically satisfies the vector Helmholtz equation

$$\nabla^2\mathbf{A} + k^2\mathbf{A} = 0$$

and the solenoidal condition

$$\nabla\cdot\mathbf{A} = 0.$$

Hint. Let $\nabla\cdot$ operate on the first equation.

1.9.13 The theory of heat conduction leads to an equation

$$\nabla^2\Psi = k|\nabla\Phi|^2,$$

where $\Phi$ is a potential satisfying Laplace's equation: $\nabla^2\Phi = 0$. Show that a solution
of this equation is

$$\Psi = \tfrac{1}{2}k\Phi^2.$$


1.10 VECTOR INTEGRATION 

The next step after differentiating vectors is to integrate them. Let us start 
with line integrals and then proceed to surface and volume integrals. In each 
case the method of attack will be to reduce the vector integral to scalar integrals 
with which the reader is assumed familiar. 

Line Integrals 

Using an increment of length $d\mathbf{r} = \mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz$, we may encounter the
line integrals

$$\int_C \varphi\,d\mathbf{r}, \tag{1.85a}$$

$$\int_C \mathbf{V}\cdot d\mathbf{r}, \tag{1.85b}$$

$$\int_C \mathbf{V}\times d\mathbf{r}, \tag{1.85c}$$

in each of which the integral is over some contour $C$ that may be open (with
starting point and ending point separated) or closed (forming a loop). Because
of its physical interpretation that follows, the second form, Eq. 1.85b, is by far
the most important of the three.

With $\varphi$, a scalar, the first integral reduces immediately to

$$\int_C \varphi\,d\mathbf{r} = \mathbf{i}\int_C \varphi(x,y,z)\,dx + \mathbf{j}\int_C \varphi(x,y,z)\,dy + \mathbf{k}\int_C \varphi(x,y,z)\,dz. \tag{1.86}$$

This separation has employed the relation

$$\int \mathbf{i}\,\varphi\,dx = \mathbf{i}\int \varphi\,dx, \tag{1.87}$$

which is permissible because the cartesian unit vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ are constant
in both magnitude and direction. Perhaps this relation is obvious here, but it
will not be true in the noncartesian systems encountered in Chapter 2.

The three integrals on the right side of Eq. 1.86 are ordinary scalar integrals
and, to avoid complications, we assume that they are Riemann integrals. Note,
however, that the integral with respect to $x$ cannot be evaluated unless $y$ and $z$
are known in terms of $x$ and similarly for the integrals with respect to $y$ and $z$.
This simply means that the path of integration $C$ must be specified. Unless the
integrand has special properties that lead the integral to depend only on the
value of the end points, the value will depend on the particular choice of
contour $C$. For instance, if we choose the very special case $\varphi = 1$, Eq. 1.85a is
just the vector distance from the start of contour $C$ to the end point, in this
case independent of the choice of path connecting fixed end points. With
$d\mathbf{r} = \mathbf{i}\,dx + \mathbf{j}\,dy + \mathbf{k}\,dz$, the second and third forms also reduce to scalar
integrals and, like Eq. 1.85a, are dependent, in general, on the choice of path.
The form (Eq. 1.85b) is exactly the same as that encountered when we calculate
the work done by a force that varies along the path,

$$W = \int \mathbf{F}\cdot d\mathbf{r} = \int F_x(x,y,z)\,dx + \int F_y(x,y,z)\,dy + \int F_z(x,y,z)\,dz. \tag{1.88a}$$

In this expression $\mathbf{F}$ is the force exerted on a particle.


EXAMPLE 1.10.1


The force exerted on a body is $\mathbf{F} = -\mathbf{i}y + \mathbf{j}x$. The problem is to calculate
the work done going from the origin to the point (1, 1).

$$W = \int_{0,0}^{1,1} \mathbf{F}\cdot d\mathbf{r} = \int_{0,0}^{1,1} (-y\,dx + x\,dy). \tag{1.88b}$$

Separating the two integrals, we obtain

$$W = -\int_0^1 y\,dx + \int_0^1 x\,dy. \tag{1.88c}$$

The first integral cannot be evaluated until we specify the values of $y$ as $x$
ranges from 0 to 1. Likewise, the second integral requires $x$ as a function of $y$.
Consider first the path shown in Fig. 1.21. Then

$$W = -\int_0^1 0\,dx + \int_0^1 1\,dy = 1, \tag{1.88d}$$

since $y = 0$ along the first segment of the path and $x = 1$ along the second.

If we select the path $[x = 0, 0 \le y \le 1]$ and $[0 \le x \le 1, y = 1]$, then Eq.
1.88c gives $W = -1$. For this force the work done depends on the choice of
path.
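
The path dependence in Example 1.10.1 is easy to reproduce numerically. The sketch below is an illustration, not part of the original example: each path is described by a hypothetical parametrization on $0 \le t \le 1$, the integral of Eq. 1.88b is approximated segment by segment with the midpoint rule, and the straight diagonal is added as a third path that the text does not discuss.

```python
def work(path, n=1000):
    """Approximate W = integral of (-y dx + x dy) along path(t), 0 <= t <= 1."""
    total = 0.0
    for k in range(n):
        x0, y0 = path(k / n)
        x1, y1 = path((k + 1) / n)
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)   # midpoint of the segment
        total += -ym * (x1 - x0) + xm * (y1 - y0)
    return total


def path_a(t):
    # Along the x-axis to (1, 0), then up the line x = 1 (the path of Fig. 1.21).
    return (2 * t, 0.0) if t <= 0.5 else (1.0, 2 * t - 1)


def path_b(t):
    # Up the y-axis to (0, 1), then along the line y = 1.
    return (0.0, 2 * t) if t <= 0.5 else (2 * t - 1, 1.0)


def path_diagonal(t):
    # Straight line from the origin to (1, 1); an extra illustrative path.
    return (t, t)


w_a = work(path_a)
w_b = work(path_b)
w_diag = work(path_diagonal)
print(w_a, w_b, w_diag)
```

The first two values reproduce $W = 1$ and $W = -1$ from the example. Along the diagonal $-y\,dx + x\,dy$ vanishes identically, so a third, different value of the work appears for the same end points.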



FIG. 1.21 A path of integration 


Surface Integrals 

Surface integrals appear in the same forms as line integrals, the element of
area also being a vector, $d\boldsymbol{\sigma}$.$^1$ Often this area element is written $\mathbf{n}\,dA$ in which
$\mathbf{n}$ is a unit (normal) vector to indicate the positive direction.$^2$ There are two
conventions for choosing the positive direction. First, if the surface is a closed
surface, we agree to take the outward normal as positive. Second, if the surface
is an open surface, the positive normal depends on the direction in which the
perimeter of the open surface is traversed. If the right-hand fingers are placed
in the direction of travel around the perimeter, the positive normal is indicated
by the thumb of the right hand. As an illustration, a circle in the $xy$-plane
(Fig. 1.22) mapped out from $x$ to $y$ to $-x$ to $-y$ and back to $x$ will have its
positive normal parallel to the positive $z$-axis (for the right-handed coordinate
system). If readers should ever encounter one-sided surfaces, such as Moebius
strips, it is suggested that they either cut the strips and form reasonable,
well-behaved surfaces or label them pathological and send them to the nearest
mathematics department.

$^1$ Recall that in Section 1.4 the area (of a parallelogram) represented a
cross-product vector.

$^2$ Although $\mathbf{n}$ always has unit length, its direction may well be a function of
position.

FIG. 1.22 Right-hand rule for the positive normal

Analogous to the line integrals, Eqs. 1.85a, b, c, surface integrals may
appear in the forms

$$\int \varphi\,d\boldsymbol{\sigma}, \qquad \int \mathbf{V}\cdot d\boldsymbol{\sigma}, \qquad \int \mathbf{V}\times d\boldsymbol{\sigma}.$$

Again, the dot product is by far the most commonly encountered form.

The surface integral $\int \mathbf{V}\cdot d\boldsymbol{\sigma}$ may be interpreted as a flow or flux through the
given surface. This is really what we did in Section 1.7 to obtain the significance
of the term divergence. This identification reappears in Section 1.11 as Gauss’s
theorem. Note that both physically and from the dot product the tangential
components of the velocity contribute nothing to the flow through the surface.


Volume Integrals 

Volume integrals are somewhat simpler, for the volume element $d\tau$ is a
scalar quantity.³ We have

$$\int_V \mathbf{V}\,d\tau = \mathbf{i}\int_V V_x\,d\tau + \mathbf{j}\int_V V_y\,d\tau + \mathbf{k}\int_V V_z\,d\tau, \tag{1.89}$$

again reducing the vector integral to a vector sum of scalar integrals.

3 Frequently the symbols $d^3r$ and $d^3x$ are used to denote a volume element in
coordinate (xyz or $x_1x_2x_3$) space.


Integral Definitions of Gradient, Divergence, 
and Curl 

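One interesting and significant application of our surface and volume
integrals is their use in developing alternate definitions of our differential
relations. We find

$$\nabla\varphi = \lim_{\int d\tau \to 0} \frac{\int \varphi\,d\boldsymbol{\sigma}}{\int d\tau}, \tag{1.90}$$

$$\nabla\cdot\mathbf{V} = \lim_{\int d\tau \to 0} \frac{\int \mathbf{V}\cdot d\boldsymbol{\sigma}}{\int d\tau}, \tag{1.91}$$

$$\nabla\times\mathbf{V} = \lim_{\int d\tau \to 0} \frac{\int d\boldsymbol{\sigma}\times\mathbf{V}}{\int d\tau}. \tag{1.92}$$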


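In these three equations $\int d\tau$ is the volume of a small region of space and $d\boldsymbol{\sigma}$
is the vector area element of this volume. The identification of Eq. 1.91 as the
divergence of V was carried out in Section 1.7. Here we show that Eq. 1.90 is
consistent with our earlier definition of $\nabla\varphi$ (Eq. 1.53). For simplicity we choose
$\int d\tau$ to be the differential volume dx dy dz (Fig. 1.23). This time we place the
origin at the geometric center of our volume element. The area integral leads
to six integrals, one for each of the six faces. Remembering that $d\boldsymbol{\sigma}$ is outward,
$d\boldsymbol{\sigma}\cdot\mathbf{i} = -|d\boldsymbol{\sigma}|$ for surface EFHG, and $+|d\boldsymbol{\sigma}|$ for surface ABDC, we have

$$\int \varphi\,d\boldsymbol{\sigma} = -\mathbf{i}\int_{EFHG}\left(\varphi - \frac{\partial\varphi}{\partial x}\frac{dx}{2}\right)dy\,dz + \mathbf{i}\int_{ABDC}\left(\varphi + \frac{\partial\varphi}{\partial x}\frac{dx}{2}\right)dy\,dz + \cdots,$$

where the dots stand for the corresponding pairs of terms from the faces normal to j and k.

FIG. 1.23 Differential rectangular parallelepiped (origin at center)

Using the first two terms of a Maclaurin expansion, we evaluate each integrand
at the origin with a correction included for the displacement (±dx/2)
of the center of the face from the origin.⁴ Having chosen the total volume to be
of differential size ($\int d\tau = dx\,dy\,dz$), we drop the integral signs on the right and
obtain

$$\int \varphi\,d\boldsymbol{\sigma} = \left(\mathbf{i}\frac{\partial\varphi}{\partial x} + \mathbf{j}\frac{\partial\varphi}{\partial y} + \mathbf{k}\frac{\partial\varphi}{\partial z}\right)dx\,dy\,dz. \tag{1.93}$$

Dividing by

$$\int d\tau = dx\,dy\,dz,$$

we verify Eq. 1.90.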

This verification has been oversimplified in ignoring other correction terms
beyond the first derivatives. These additional terms, which are introduced in
Section 5.6 when the Taylor expansion is developed, vanish in the limit

$$\int d\tau \to 0 \quad (dx \to 0,\; dy \to 0,\; dz \to 0).$$

This, of course, is the reason for specifying in Eqs. 1.90, 1.91, and 1.92 that this
limit be taken.

Verification of Eq. 1.92 follows these same lines exactly, using a differential 
volume dxdydz. 
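The integral definition of the gradient, Eq. 1.90, can be tried out numerically on a small cube. The sketch below is illustrative only: the scalar field `phi`, the sample point, and the cube size are arbitrary choices made here, not taken from the text. It sums the scalar field times the outward area element over the six faces and divides by the volume.

```python
import math


def phi(x, y, z):
    # Arbitrary smooth test field (an assumption for this sketch)
    return x * x * y + math.sin(z)


def grad_phi_exact(x, y, z):
    return (2 * x * y, x * x, math.cos(z))


def grad_from_surface_integral(f, center, h=1e-2, n=8):
    """Estimate grad f via Eq. 1.90: (sum of f d(sigma) over the cube faces) / volume.

    The cube has side h and is centered at `center`; each face is sampled
    on an n x n midpoint grid.
    """
    da = (h / n) ** 2
    offsets = [(-0.5 + (k + 0.5) / n) * h for k in range(n)]
    result = [0.0, 0.0, 0.0]
    for axis in range(3):
        for sign in (-1.0, 1.0):          # outward normal is sign * unit vector
            face_sum = 0.0
            for u in offsets:
                for v in offsets:
                    p = list(center)
                    p[axis] += sign * h / 2
                    p[(axis + 1) % 3] += u
                    p[(axis + 2) % 3] += v
                    face_sum += f(*p) * da
            result[axis] += sign * face_sum
    return tuple(r / h ** 3 for r in result)


point = (1.0, 2.0, 0.5)
estimate = grad_from_surface_integral(phi, point)
exact = grad_phi_exact(*point)
print(estimate, exact)
```

Shrinking `h` brings the estimate closer to the exact gradient, which is the content of the limit in Eq. 1.90; the residual difference comes from the higher-order terms dropped in the verification.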


EXERCISES 


1.10.1 The force field acting on a two-dimensional linear oscillator may be described by

$$\mathbf{F} = -\mathbf{i}kx - \mathbf{j}ky.$$

Compare the work done moving against this force field when going from (1,1)
to (4,4) by the following straight-line paths:

(a) (1,1) → (4,1) → (4,4)

(b) (1,1) → (1,4) → (4,4)

(c) (1,1) → (4,4) along x = y.

This means evaluating

$$-\int_{(1,1)}^{(4,4)} \mathbf{F}\cdot d\mathbf{r}$$

along each path.


4 The origin has been placed at the geometric center.

1.10.2 Find the work done going around a unit circle in the xy-plane:

(a) counterclockwise from 0 to π,

(b) clockwise from 0 to -π,

doing work against a force field given by

$$\mathbf{F} = \frac{-\mathbf{i}y}{x^2 + y^2} + \frac{\mathbf{j}x}{x^2 + y^2}.$$

Note that the work done depends on the path.

1.10.3 Calculate the work you do in going from point (1,1) to point (3,3). The force
you exert is given by

$$\mathbf{F} = \mathbf{i}(x - y) + \mathbf{j}(x + y).$$

Specify clearly the path you choose. Note that this force field is nonconservative.

1.10.4 Evaluate $\oint \mathbf{r}\cdot d\mathbf{r}$.

Note. The symbol $\oint$ means that the path of integration is a closed loop.

1.10.5 Evaluate

$$\frac{1}{3}\int_S \mathbf{r}\cdot d\boldsymbol{\sigma}$$

over the unit cube defined by the point (0,0,0) and the unit intercepts on the
positive x-, y-, and z-axes. Note that (a) $\mathbf{r}\cdot d\boldsymbol{\sigma}$ is zero for three of the surfaces
and (b) each of the three remaining surfaces contributes the same amount to
the integral.

1.10.6 Show, by expansion of the surface integral, that

$$\lim_{\int d\tau \to 0} \frac{\int_S d\boldsymbol{\sigma}\times\mathbf{V}}{\int d\tau} = \nabla\times\mathbf{V}.$$

Hint. Choose the volume to be a differential volume, dx dy dz.


1.11 GAUSS'S THEOREM 

Here we derive a useful relation between a surface integral of a vector and
the volume integral of the divergence of that vector. Let us assume that the
vector V and its first derivatives are continuous over the region of interest.
Then Gauss’s theorem states that

$$\int_S \mathbf{V}\cdot d\boldsymbol{\sigma} = \int_V \nabla\cdot\mathbf{V}\,d\tau. \tag{1.94a}$$

In words, the surface integral of a vector over a closed surface equals the
volume integral of the divergence of that vector integrated over the volume
enclosed by the surface.

Imagine that volume V is subdivided into an arbitrarily large number of
tiny (differential) parallelepipeds. For each parallelepiped

$$\sum_{\text{six surfaces}} \mathbf{V}\cdot d\boldsymbol{\sigma} = \nabla\cdot\mathbf{V}\,d\tau \tag{1.94b}$$

from the analysis of Section 1.7, Eq. 1.61, with $\rho\mathbf{v}$ replaced by V. The summation
is over the six faces of the parallelepiped. Summing over all parallelepipeds,
we find that the $\mathbf{V}\cdot d\boldsymbol{\sigma}$ terms cancel (pairwise) for all interior faces; only the
contributions of the exterior surfaces survive (Fig. 1.24). Analogous to the
definition of a Riemann integral as the limit of a sum, we take the limit as the
number of parallelepipeds approaches infinity (→ ∞) and the dimensions of
each approach zero (→ 0):

$$\begin{array}{ccc}
\displaystyle\sum_{\text{exterior surfaces}} \mathbf{V}\cdot d\boldsymbol{\sigma} & = & \displaystyle\sum_{\text{volumes}} \nabla\cdot\mathbf{V}\,d\tau \\
\downarrow & & \downarrow \\
\displaystyle\int_S \mathbf{V}\cdot d\boldsymbol{\sigma} & = & \displaystyle\int_V \nabla\cdot\mathbf{V}\,d\tau.
\end{array}$$

FIG. 1.24 Exact cancellation of $d\boldsymbol{\sigma}$'s on interior surfaces. No cancellation on exterior surface.

The result is Eq. 1.94a, Gauss’s theorem.

From a physical point of view Eq. 1.61 has established $\nabla\cdot\mathbf{V}$ as the net
outflow of fluid per unit volume. The volume integral then gives the total net
outflow. But the surface integral $\int \mathbf{V}\cdot d\boldsymbol{\sigma}$ is just another way of expressing this
same quantity, which is the equality, Gauss’s theorem.
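Gauss’s theorem lends itself to a direct numerical check. The following sketch is an illustration under stated assumptions: the test field V = (xy, yz, zx), the unit cube as the region, and the helper names are all choices made here and do not come from the text. It compares the outward flux through the cube faces with the volume integral of the divergence, the latter estimated by central differences.

```python
def sample_field(x, y, z):
    # Hypothetical test field V = (xy, yz, zx); its divergence is x + y + z
    return (x * y, y * z, z * x)


def flux_unit_cube(field, n=20):
    """Outward flux of `field` through the faces of the cube 0 <= x, y, z <= 1."""
    mids = [(k + 0.5) / n for k in range(n)]
    da = 1.0 / n ** 2
    total = 0.0
    for axis in range(3):
        for side, sign in ((0.0, -1.0), (1.0, 1.0)):   # outward normal direction
            for u in mids:
                for v in mids:
                    p = [0.0, 0.0, 0.0]
                    p[axis] = side
                    p[(axis + 1) % 3] = u
                    p[(axis + 2) % 3] = v
                    total += sign * field(*p)[axis] * da
    return total


def divergence_integral_unit_cube(field, n=20, eps=1e-5):
    """Volume integral of div(field) over the unit cube (midpoint rule)."""
    mids = [(k + 0.5) / n for k in range(n)]
    dtau = 1.0 / n ** 3
    total = 0.0
    for x in mids:
        for y in mids:
            for z in mids:
                div = 0.0
                for axis in range(3):
                    hi, lo = [x, y, z], [x, y, z]
                    hi[axis] += eps
                    lo[axis] -= eps
                    div += (field(*hi)[axis] - field(*lo)[axis]) / (2 * eps)
                total += div * dtau
    return total


surface_side = flux_unit_cube(sample_field)
volume_side = divergence_integral_unit_cube(sample_field)
print(surface_side, volume_side)
```

Both numbers should agree (for this field each is 3/2). Replacing the test field by the position vector (x, y, z) gives a flux of 3, three times the volume of the cube.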

GREEN'S THEOREM 

A frequently useful corollary of Gauss’s theorem is a relation known as
Green’s theorem. If u and v are two scalar functions, we have the identities

$$\nabla\cdot(u\nabla v) = u\nabla\cdot\nabla v + (\nabla u)\cdot(\nabla v), \tag{1.95}$$

$$\nabla\cdot(v\nabla u) = v\nabla\cdot\nabla u + (\nabla v)\cdot(\nabla u). \tag{1.96}$$

Subtracting Eq. 1.96 from Eq. 1.95, integrating over a volume (u, v, and their
derivatives, assumed continuous), and applying Eq. 1.94 (Gauss’s theorem),
we obtain

$$\int_V (u\nabla\cdot\nabla v - v\nabla\cdot\nabla u)\,d\tau = \int_S (u\nabla v - v\nabla u)\cdot d\boldsymbol{\sigma}. \tag{1.97}$$

This is Green’s theorem. We use it for developing Green’s functions, Chapters
8 and 16. An alternate form of Green’s theorem derived from Eq. 1.95 alone is

$$\int_S u\nabla v\cdot d\boldsymbol{\sigma} = \int_V u\nabla\cdot\nabla v\,d\tau + \int_V \nabla u\cdot\nabla v\,d\tau. \tag{1.98}$$

This is the form of Green’s theorem used in Section 1.15.


ALTERNATE FORMS OF GAUSS'S THEOREM 

Although Eq. 1.94 involving the divergence is by far the most important
form of Gauss’s theorem, volume integrals involving the gradient and the curl
may also appear. Suppose

$$\mathbf{V}(x,y,z) = V(x,y,z)\,\mathbf{a}, \tag{1.99}$$

in which a is a vector with constant magnitude and constant but arbitrary
direction. (You pick the direction, but once you have chosen it, hold it fixed.)
Equation 1.94 becomes

$$\mathbf{a}\cdot\int_S V\,d\boldsymbol{\sigma} = \int_V \nabla\cdot(\mathbf{a}V)\,d\tau = \mathbf{a}\cdot\int_V \nabla V\,d\tau \tag{1.100}$$

by Eq. 1.62a. This may be rewritten

$$\mathbf{a}\cdot\left[\int_S V\,d\boldsymbol{\sigma} - \int_V \nabla V\,d\tau\right] = 0. \tag{1.101}$$

Since |a| ≠ 0 and its direction is arbitrary, meaning that the cosine of the
included angle cannot always vanish, the term in brackets must be zero.¹ The
result is

$$\int_S V\,d\boldsymbol{\sigma} = \int_V \nabla V\,d\tau. \tag{1.102}$$

In a similar manner, using $\mathbf{V} = \mathbf{a}\times\mathbf{P}$ in which a is a constant vector, we may
show

$$\int_S d\boldsymbol{\sigma}\times\mathbf{P} = \int_V \nabla\times\mathbf{P}\,d\tau. \tag{1.103}$$

These last two forms of Gauss’s theorem are used in the vector form of Kirchhoff
diffraction theory. They may also be used to verify Eqs. 1.90 and 1.92.

Gauss’s theorem may also be extended to dyadics or tensors (see Section
3.5).


1 This exploitation of the arbitrary nature of a part of a problem is a valuable
and widely used technique. The arbitrary vector is used again in Sections 1.12
and 1.13. Other examples appear in Section 1.14 (integrands equated) and in
Section 3.3, quotient rule.

EXERCISES

1.11.1 Using Gauss’s theorem prove that

$$\oint_S d\boldsymbol{\sigma} = 0,$$

if S is a closed surface.

1.11.2 Show that

$$\frac{1}{3}\oint_S \mathbf{r}\cdot d\boldsymbol{\sigma} = V,$$

where V is the volume enclosed by the closed surface S.

Note. This is a generalization of Exercise 1.10.5.

1.11.3 If $\mathbf{B} = \nabla\times\mathbf{A}$, show that

$$\oint_S \mathbf{B}\cdot d\boldsymbol{\sigma} = 0$$

for any closed surface S.

1.11.4 Over some volume V let ψ be a solution of Laplace’s equation (with the derivatives
appearing there continuous). Prove that the integral over any closed
surface in V of the normal derivative of ψ ($\partial\psi/\partial n$, or $\nabla\psi\cdot\mathbf{n}$) will be zero.

1.11.5 In analogy to the integral definitions of gradient, divergence, and curl of Section
1.10, show that

$$\nabla^2\varphi = \lim_{\int d\tau \to 0} \frac{\int \nabla\varphi\cdot d\boldsymbol{\sigma}}{\int d\tau}.$$

1.11.6 The electric displacement vector D satisfies the Maxwell equation $\nabla\cdot\mathbf{D} = \rho$
where ρ is the charge density (per unit volume). At the boundary between two
media there is a surface charge density σ (per unit area). Show that a boundary
condition for D is

$$(\mathbf{D}_2 - \mathbf{D}_1)\cdot\mathbf{n} = \sigma.$$

n is a unit vector normal to the surface and out of medium 1.

Hint. Consider a thin pillbox as shown in the figure.



1.11.7 From Eq. 1.62a with V the electric field E, and f the electrostatic potential φ,
show that

$$\int \rho\varphi\,d\tau = \varepsilon_0\int E^2\,d\tau.$$

This corresponds to a three-dimensional integration by parts.

Hint. $\mathbf{E} = -\nabla\varphi$, $\nabla\cdot\mathbf{E} = \rho/\varepsilon_0$. You may assume that φ vanishes at large r at
least as fast as $r^{-1}$.

1.11.8 A particular steady-state electric current distribution is localized in space.
Choosing a bounding surface far enough out so that the current density J is zero
everywhere on the surface, show that

$$\int \mathbf{J}\,d\tau = 0.$$

Hint. Take one component of J at a time. With $\nabla\cdot\mathbf{J} = 0$, show that $J_i = \nabla\cdot(x_i\mathbf{J})$
and apply Gauss’s theorem.

1.11.9 The creation of a localized system of steady electric currents (current density
J) and magnetic fields may be shown to require an amount of work

$$W = \frac{1}{2}\int \mathbf{H}\cdot\mathbf{B}\,d\tau.$$

Transform this into

$$W = \frac{1}{2}\int \mathbf{J}\cdot\mathbf{A}\,d\tau.$$

Here A is the magnetic vector potential: $\nabla\times\mathbf{A} = \mathbf{B}$.

Hint. In Maxwell’s equations take the displacement current term $\partial\mathbf{D}/\partial t = 0$.
If the fields and currents are localized, a bounding surface may be taken far
enough out so that the integrals of the fields and currents over the surface
yield zero.

1.11.10 Prove the generalization of Green’s theorem:

$$\int_V (v\mathcal{L}u - u\mathcal{L}v)\,d\tau = \oint_S p(v\nabla u - u\nabla v)\cdot d\boldsymbol{\sigma}.$$

Here $\mathcal{L}$ is the self-adjoint operator (Section 9.1):

$$\mathcal{L} = \nabla\cdot[p(\mathbf{r})\nabla] + q(\mathbf{r})$$

and p, q, u, and v are functions of position, p and q having continuous first
derivatives and u and v having continuous second derivatives.

Note. This generalized Green’s theorem appears in Sections 8.7 and 16.6.


1.12 STOKES'S THEOREM 

Gauss’s theorem relates the volume integral of a derivative of a function to 
an integral of the function over the closed surface bounding the volume. Here 
we consider an analogous relation between the surface integral of a derivative 
of a function and the line integral of the function, the path of integration being 
the perimeter bounding the surface. 

Let us take the surface and subdivide it into a network of arbitrarily small
rectangles. In Section 1.8 we showed that the circulation about such a differential
rectangle (in the xy-plane) is $\nabla\times\mathbf{V}|_z\,dx\,dy$. From Eq. 1.71 applied to one
differential rectangle

$$\sum_{\text{four sides}} \mathbf{V}\cdot d\boldsymbol{\lambda} = \nabla\times\mathbf{V}\cdot d\boldsymbol{\sigma}. \tag{1.104}$$

We sum over all the little rectangles as in the definition of a Riemann integral.
The surface contributions (right-hand side of Eq. 1.104) are added together.
The line integrals (left-hand side of Eq. 1.104) of all interior line segments cancel
identically. Only the line integral around the perimeter survives (Fig. 1.25).
Taking the usual limit as the number of rectangles approaches infinity while
dx → 0, dy → 0, we have

$$\begin{array}{ccc}
\displaystyle\sum_{\text{exterior line segments}} \mathbf{V}\cdot d\boldsymbol{\lambda} & = & \displaystyle\sum_{\text{rectangles}} \nabla\times\mathbf{V}\cdot d\boldsymbol{\sigma} \\
\downarrow & & \downarrow \\
\displaystyle\oint \mathbf{V}\cdot d\boldsymbol{\lambda} & = & \displaystyle\int_S \nabla\times\mathbf{V}\cdot d\boldsymbol{\sigma}.
\end{array} \tag{1.105}$$

FIG. 1.25 Exact cancellation on interior paths. No cancellation on exterior path.

This is Stokes’s theorem. The surface integral on the right is over the surface
bounded by the perimeter or contour for the line integral on the left.

This demonstration of Stokes’s theorem is limited by the fact that we used a
Maclaurin expansion of V(x,y,z) in establishing Eq. 1.71 in Section 1.8.
Actually we need only demand that the curl of V(x,y,z) exists and that it be
integrable over the surface. A proof of the Cauchy integral theorem analogous
to the development of Stokes’s theorem here but using these less restrictive
conditions appears in Section 6.3.

Stokes’s theorem obviously applies to an open surface. It is possible to consider
a closed surface as a limiting case of an open surface with the opening (and
therefore the perimeter) shrinking to zero. This is the point of Exercise 1.12.7.
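As with Gauss’s theorem, Stokes’s theorem can be tested numerically. The sketch below rests on assumptions made only for illustration: the planar test field V = (-y³, x³), the unit disk in the xy-plane as the surface, and the helper names. The contour is traversed counterclockwise, so the positive normal is along +z by the right-hand rule.

```python
import math


def field(x, y):
    # Hypothetical planar test field V = (-y^3, x^3); (curl V)_z = 3(x^2 + y^2)
    return (-y ** 3, x ** 3)


def circulation_unit_circle(f, n=2000):
    """Line integral of f . d(lambda) counterclockwise around the unit circle."""
    dtheta = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        th = (k + 0.5) * dtheta
        x, y = math.cos(th), math.sin(th)
        fx, fy = f(x, y)
        # d(lambda) = (-sin th, cos th) d(theta) on the unit circle
        total += (-fx * math.sin(th) + fy * math.cos(th)) * dtheta
    return total


def curl_z(f, x, y, eps=1e-5):
    """z-component of curl f by central differences."""
    dfy_dx = (f(x + eps, y)[1] - f(x - eps, y)[1]) / (2 * eps)
    dfx_dy = (f(x, y + eps)[0] - f(x, y - eps)[0]) / (2 * eps)
    return dfy_dx - dfx_dy


def curl_flux_unit_disk(f, nr=200, nth=200):
    """Surface integral of (curl f)_z over the unit disk, in polar coordinates."""
    dr, dth = 1.0 / nr, 2 * math.pi / nth
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(nth):
            th = (j + 0.5) * dth
            total += curl_z(f, r * math.cos(th), r * math.sin(th)) * r * dr * dth
    return total


line_side = circulation_unit_circle(field)
surface_side = curl_flux_unit_disk(field)
print(line_side, surface_side, 1.5 * math.pi)
```

For this field both sides come out near 3π/2. With the field (-y, x) instead, the circulation is 2π, twice the area of the disk, which is the statement of Exercise 1.12.1.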


ALTERNATE FORMS OF STOKES'S THEOREM 

As with Gauss’s theorem, other relations between surface and line integrals
are possible. We find

$$\int_S d\boldsymbol{\sigma}\times\nabla\varphi = \oint \varphi\,d\boldsymbol{\lambda} \tag{1.106}$$

and

$$\int_S (d\boldsymbol{\sigma}\times\nabla)\times\mathbf{P} = \oint d\boldsymbol{\lambda}\times\mathbf{P}. \tag{1.107}$$

Equation 1.106 may readily be verified by the substitution $\mathbf{V} = \mathbf{a}\varphi$ in which a
is a vector of constant magnitude and of constant direction, as in Section 1.11.
Substituting into Stokes’s theorem, Eq. 1.105,

$$\int_S (\nabla\times\mathbf{a}\varphi)\cdot d\boldsymbol{\sigma} = -\int_S \mathbf{a}\times\nabla\varphi\cdot d\boldsymbol{\sigma} = -\mathbf{a}\cdot\int_S \nabla\varphi\times d\boldsymbol{\sigma}. \tag{1.108}$$

For the line integral

$$\oint \mathbf{a}\varphi\cdot d\boldsymbol{\lambda} = \mathbf{a}\cdot\oint \varphi\,d\boldsymbol{\lambda}, \tag{1.109}$$

and we obtain

$$\mathbf{a}\cdot\left(\oint \varphi\,d\boldsymbol{\lambda} + \int_S \nabla\varphi\times d\boldsymbol{\sigma}\right) = 0. \tag{1.110}$$

Since the choice of direction of a is arbitrary, the expression in parentheses
must vanish, thus verifying Eq. 1.106. Equation 1.107 may be derived similarly
by using $\mathbf{V} = \mathbf{a}\times\mathbf{P}$, in which a is again a constant vector.

Both Stokes’s and Gauss’s theorems are of tremendous importance in a wide
variety of problems involving vector calculus. Some idea of their power and
versatility may be obtained from the exercises of Sections 1.11 and 1.12 and the
development of potential theory in Sections 1.13 and 1.14.


EXERCISES 

1.12.1 Given a vector $\mathbf{t} = -\mathbf{i}y + \mathbf{j}x$. With the help of Stokes’s theorem, show that
the integral around a continuous closed curve in the xy-plane

$$\frac{1}{2}\oint \mathbf{t}\cdot d\boldsymbol{\lambda} = \frac{1}{2}\oint (x\,dy - y\,dx) = A,$$

the area enclosed by the curve.

1.12.2 The calculation of the magnetic moment of a current loop leads to the line
integral

$$\oint \mathbf{r}\times d\mathbf{r}.$$

(a) Integrate around the perimeter of a current loop (in the xy-plane) and
show that the scalar magnitude of this line integral is twice the area of
the enclosed surface.

(b) The perimeter of an ellipse is described by $\mathbf{r} = \mathbf{i}a\cos\theta + \mathbf{j}b\sin\theta$. From
part (a) show that the area of the ellipse is πab.

1.12.3 Evaluate $\oint \mathbf{r}\times d\mathbf{r}$ by using the alternate form of Stokes’s theorem given by
Eq. 1.107:

$$\int_S (d\boldsymbol{\sigma}\times\nabla)\times\mathbf{P} = \oint d\boldsymbol{\lambda}\times\mathbf{P}.$$

Take the loop to be entirely in the xy-plane.

1.12.4 In steady state the magnetic field H satisfies the Maxwell equation $\nabla\times\mathbf{H} = \mathbf{J}$,
where J is the current density (per square meter). At the boundary between
two media there is a surface current density K (per meter). Show that a boundary
condition on H is

$$\mathbf{n}\times(\mathbf{H}_2 - \mathbf{H}_1) = \mathbf{K}.$$

n is a unit vector normal to the surface and out of medium 1.

Hint. Consider a narrow loop perpendicular to the interface as shown in the
figure.

[Figure: narrow loop straddling the interface between medium 1 and medium 2, with unit normal n]

1.12.5 From Maxwell’s equations, $\nabla\times\mathbf{H} = \mathbf{J}$ with J here the current density and
E = 0. Show from this that

$$\oint \mathbf{H}\cdot d\mathbf{r} = I$$

where I is the net electric current enclosed by the loop integral. These are the
differential and integral forms of Ampere’s law of magnetism.

1.12.6 A magnetic induction B is generated by electric current in a ring of radius R.
Show that the magnitude of the vector potential A ($\mathbf{B} = \nabla\times\mathbf{A}$) at the ring is

$$|\mathbf{A}| = \frac{\varphi}{2\pi R},$$

where φ is the total magnetic flux passing through the ring.

Note. A is tangential to the ring.

1.12.7 Prove that

$$\int_S \nabla\times\mathbf{V}\cdot d\boldsymbol{\sigma} = 0,$$

if S is a closed surface.

1.12.8 Evaluate $\oint \mathbf{r}\cdot d\mathbf{r}$ (Exercise 1.10.4) by Stokes’s theorem.

1.12.9 Prove that

$$\oint u\nabla v\cdot d\boldsymbol{\lambda} = -\oint v\nabla u\cdot d\boldsymbol{\lambda}.$$

1.12.10 Prove that

$$\oint u\nabla v\cdot d\boldsymbol{\lambda} = \int_S (\nabla u)\times(\nabla v)\cdot d\boldsymbol{\sigma}.$$


1.13 POTENTIAL THEORY 

Scalar Potential 

If a force over a given region of space S can be expressed as the negative
gradient of a scalar function φ,

$$\mathbf{F} = -\nabla\varphi, \tag{1.111}$$

we call φ a scalar potential. The force F appearing as the negative gradient of
a single-valued scalar potential is labeled a conservative force. We want to know
when a scalar potential function exists. To answer this question we establish
two other relations as equivalent to Eq. 1.111. These are

$$\nabla\times\mathbf{F} = 0 \tag{1.112}$$

and

$$\oint \mathbf{F}\cdot d\mathbf{r} = 0, \tag{1.113}$$

for every closed path in our region S. We proceed to show that each of these
three equations implies the other two.

Let us start with

$$\mathbf{F} = -\nabla\varphi. \tag{1.114}$$

Then

$$\nabla\times\mathbf{F} = -\nabla\times\nabla\varphi = 0 \tag{1.115}$$

by Eq. 1.77; that is, Eq. 1.111 implies Eq. 1.112. Turning to the line integral, we have

$$\oint \mathbf{F}\cdot d\mathbf{r} = -\oint \nabla\varphi\cdot d\mathbf{r} = -\oint d\varphi, \tag{1.116}$$

using Eq. 1.56. Now dφ integrates to give φ. Since we have specified a closed
loop, the end points coincide and we get zero for every closed path in our
region S for which Eq. 1.111 holds. It is important to note the restriction here
that the potential be single-valued and that Eq. 1.111 hold for all points in S.
This problem may arise in using a scalar magnetic potential, a perfectly valid
procedure as long as no net current is encircled. As soon as we choose a path
in space that encircles a net current, the scalar magnetic potential ceases to be
single-valued and our analysis no longer applies.

Continuing this demonstration of equivalence, let us assume that Eq. 1.113
holds. If $\oint \mathbf{F}\cdot d\mathbf{r} = 0$ for all paths in S, we see that the value of the integral
joining two distinct points A and B is independent of the path (Fig. 1.26). Our
premise is that

$$\oint_{ACBDA} \mathbf{F}\cdot d\mathbf{r} = 0. \tag{1.117}$$

Therefore

$$\int_{ACB} \mathbf{F}\cdot d\mathbf{r} = -\int_{BDA} \mathbf{F}\cdot d\mathbf{r} = \int_{ADB} \mathbf{F}\cdot d\mathbf{r}, \tag{1.118}$$

reversing the sign by reversing the direction of integration. Physically, this
means that the work done in going from A to B is independent of the path and
that the work done in going around a closed path is zero. This is the reason for
labeling such a force conservative: Energy is conserved.

FIG. 1.26 Possible paths for doing work

With the result shown in Eq. 1.118, we have the work done dependent only
on the end points, A and B. That is,

$$\text{work done by force} = \int_A^B \mathbf{F}\cdot d\mathbf{r} = \varphi(A) - \varphi(B). \tag{1.119}$$

Eq. 1.119 defines a scalar potential (strictly speaking, the difference in potential
between points A and B) and provides a means of calculating the potential. If
point B is taken as a variable, say, (x,y,z), then differentiation with respect to
x, y, and z will recover Eq. 1.111.

The choice of sign on the right-hand side is arbitrary. The choice here is made
to achieve agreement with Eq. 1.111 and to ensure that water will run downhill
rather than uphill. For points A and B separated by a length dr, Eq. 1.119
becomes

$$\mathbf{F}\cdot d\mathbf{r} = -d\varphi = -\nabla\varphi\cdot d\mathbf{r}. \tag{1.120}$$

This may be rewritten

$$(\mathbf{F} + \nabla\varphi)\cdot d\mathbf{r} = 0, \tag{1.121}$$

and since dr is arbitrary, Eq. 1.111 must follow.

If

$$\oint \mathbf{F}\cdot d\mathbf{r} = 0, \tag{1.122}$$

we may obtain Eq. 1.112 by using Stokes’s theorem (Eq. 1.105),

$$\oint \mathbf{F}\cdot d\mathbf{r} = \int \nabla\times\mathbf{F}\cdot d\boldsymbol{\sigma}. \tag{1.123}$$

If we take the path of integration to be the perimeter of an arbitrary differential
area $d\boldsymbol{\sigma}$, the integrand in the surface integral must vanish. Hence Eq. 1.113
implies Eq. 1.112.

Finally, if $\nabla\times\mathbf{F} = 0$, we need only reverse our statement of Stokes’s
theorem (Eq. 1.123) to derive Eq. 1.113. Then, by Eqs. 1.119 to 1.121, the
initial statement $\mathbf{F} = -\nabla\varphi$ is derived. The triple equivalence is demonstrated
(Fig. 1.27).

FIG. 1.27 Equivalent formulations (diagram linking $\mathbf{F} = -\nabla\varphi$, Eq. 1.111, with Eqs. 1.112 and 1.113)

To summarize, a single-valued scalar potential function φ exists if and only
if F is irrotational or the work done around every closed loop is zero. The
gravitational and electrostatic force fields given by Eq. 1.75 are irrotational and
therefore are conservative. Gravitational and electrostatic scalar potentials
exist. Now, by calculating the work done (Eq. 1.119), we proceed to determine
three potentials, Fig. 1.28.
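Before turning to the examples, the path independence that characterizes a conservative force can be illustrated numerically. The sketch below is an illustration only: the force constant `K = 1`, the helper names, and the midpoint-rule integration are assumptions made here. It uses the oscillator force of Exercise 1.10.1 and compares the work done against the force along the three paths of that exercise with the potential difference implied by Eq. 1.119.

```python
K = 1.0  # hypothetical force constant, chosen only for the illustration


def force_sho(x, y):
    # F = -i K x - j K y, the two-dimensional oscillator force
    return (-K * x, -K * y)


def potential_sho(x, y):
    # Candidate scalar potential, phi = K r^2 / 2, so that F = -grad(phi)
    return 0.5 * K * (x * x + y * y)


def line_integral(force, points, steps=500):
    """Integral of F . dr along straight segments joining `points` (midpoint rule)."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx = (x1 - x0) / steps
        dy = (y1 - y0) / steps
        for k in range(steps):
            t = (k + 0.5) / steps
            fx, fy = force(x0 + t * (x1 - x0), y0 + t * (y1 - y0))
            total += fx * dx + fy * dy
    return total


paths = {
    "a": [(1, 1), (4, 1), (4, 4)],
    "b": [(1, 1), (1, 4), (4, 4)],
    "c": [(1, 1), (4, 4)],
}
# Work done against the force is minus the work done by the force.
work_against = {name: -line_integral(force_sho, pts) for name, pts in paths.items()}
delta_phi = potential_sho(4, 4) - potential_sho(1, 1)
print(work_against, delta_phi)
```

All three paths give the same work, equal to φ(4,4) - φ(1,1), as Eq. 1.119 requires for a force derived from a single-valued potential.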

EXAMPLE 1.13.1 Gravitational Potential 


Find the scalar potential for the gravitational force on a unit mass $m_1$,

$$\mathbf{F}_G = -\frac{G m_1 m_2\,\mathbf{r}_0}{r^2}, \quad \text{radially inward.} \tag{1.124}$$

By integrating Eq. 1.111 from infinity into position r, we obtain

$$\varphi_G(r) - \varphi_G(\infty) = -\int_\infty^r \mathbf{F}_G\cdot d\mathbf{r} = +\int_r^\infty \mathbf{F}_G\cdot d\mathbf{r}. \tag{1.125}$$

By use of $\mathbf{F}_G = -\mathbf{F}_{\text{applied}}$, a comparison with Eq. 1.88 shows that the potential
is the work done in bringing the unit mass in from infinity. (We can define only
potential difference. Here we arbitrarily assign infinity to be a zero of potential.)
The integral on the right-hand side of Eq. 1.125 is negative, meaning that $\varphi_G(r)$
is negative. Since $\mathbf{F}_G$ is radial, we obtain a contribution to φ only when dr is
radial or

$$\varphi_G(r) = -\int_r^\infty \frac{G m_1 m_2\,dr}{r^2} = -\frac{G m_1 m_2}{r}.$$

The final negative sign is a consequence of the attractive force of gravity.


EXAMPLE 1.13.2 Centrifugal Potential


Calculate the scalar potential for the centrifugal force per unit mass,
$\mathbf{F}_c = \omega^2 r\,\mathbf{r}_0$, radially outward. Physically, this might be you on a large horizontal
spinning disk at an amusement park. Proceeding as in Example 1.13.1, but
integrating from the origin outward and taking $\varphi_c(0) = 0$, we have

$$\varphi_c(r) = -\int_0^r \mathbf{F}_c\cdot d\mathbf{r} = -\frac{\omega^2 r^2}{2}.$$

If we reverse signs, taking $\mathbf{F}_{SHO} = -k\mathbf{r}$, we obtain $\varphi_{SHO} = \tfrac{1}{2}kr^2$, the simple
harmonic oscillator potential.

FIG. 1.28 Potential energy versus distance (gravitational, centrifugal, and simple
harmonic oscillator)

The gravitational, centrifugal, and simple harmonic oscillator potentials are
shown in Fig. 1.28. Clearly, the simple harmonic oscillator yields stability and
describes a restoring force. The centrifugal potential describes an unstable
situation.


THERMODYNAMICS— EXACT 
DIFFERENTIALS 

In thermodynamics, which is sometimes called a search for exact differentials,
we encounter equations of the form

$$df = P(x,y)\,dx + Q(x,y)\,dy. \tag{1.126}$$

The usual problem is to determine whether $\int (P(x,y)\,dx + Q(x,y)\,dy)$ depends
only on the end points, that is, whether df is indeed an exact differential. The
necessary and sufficient condition is that

$$df = \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy \tag{1.126a}$$

or that

$$P(x,y) = \frac{\partial f}{\partial x}, \qquad Q(x,y) = \frac{\partial f}{\partial y}. \tag{1.126b}$$

Equations 1.126b depend on the relation

$$\frac{\partial P(x,y)}{\partial y} = \frac{\partial Q(x,y)}{\partial x} \tag{1.126c}$$

being satisfied. This, however, is exactly analogous to Eq. 1.112, the requirement
that F be irrotational. Indeed, the z-component of Eq. 1.112 yields

$$\frac{\partial F_x}{\partial y} = \frac{\partial F_y}{\partial x}. \tag{1.126d}$$
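The exactness condition, Eq. 1.126c, is easy to spot-check numerically. The following sketch is illustrative: the helper names, the sample points, and the two differentials tested are choices made here. A finite-difference test at a handful of points can refute exactness but cannot prove it, so it is offered only as a quick sanity check.

```python
def partial(f, x, y, var, eps=1e-6):
    """Central-difference partial derivative of f(x, y) with respect to `var`."""
    if var == "x":
        return (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
    return (f(x, y + eps) - f(x, y - eps)) / (2 * eps)


def looks_exact(P, Q, samples, tol=1e-6):
    """True if dP/dy and dQ/dx agree within `tol` at every sample point."""
    return all(
        abs(partial(P, x, y, "y") - partial(Q, x, y, "x")) < tol
        for x, y in samples
    )


samples = [(0.3, 0.7), (1.2, -0.4), (-2.0, 1.5)]   # arbitrary test points

# df = y dx + x dy is d(xy): an exact differential
exact_case = looks_exact(lambda x, y: y, lambda x, y: x, samples)

# -y dx + x dy fails the condition; its integral is path dependent
inexact_case = looks_exact(lambda x, y: -y, lambda x, y: x, samples)

print(exact_case, inexact_case)
```

The second differential has the same form as the work integrand of Example 1.10.1, whose value was found to depend on the path.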


Vector Potential 

In some branches of physics, especially electromagnetic theory, it is convenient
to introduce a vector potential A, such that a (force) field B is given by

$$\mathbf{B} = \nabla\times\mathbf{A}. \tag{1.127}$$

Clearly, if Eq. 1.127 holds, $\nabla\cdot\mathbf{B} = 0$ by Eq. 1.79 and B is solenoidal. Here we
want to develop a converse, to show that when B is solenoidal a vector potential
A exists. We demonstrate the existence of A by actually calculating it. Suppose
$\mathbf{B} = \mathbf{i}b_1 + \mathbf{j}b_2 + \mathbf{k}b_3$ and our unknown $\mathbf{A} = \mathbf{i}a_1 + \mathbf{j}a_2 + \mathbf{k}a_3$. By Eq. 1.127

$$\frac{\partial a_3}{\partial y} - \frac{\partial a_2}{\partial z} = b_1, \tag{1.128a}$$

$$\frac{\partial a_1}{\partial z} - \frac{\partial a_3}{\partial x} = b_2, \tag{1.128b}$$

$$\frac{\partial a_2}{\partial x} - \frac{\partial a_1}{\partial y} = b_3. \tag{1.128c}$$

Let us assume that the coordinates have been chosen so that A is parallel to the
yz-plane; that is, $a_1 = 0$.¹ Then

$$b_2 = -\frac{\partial a_3}{\partial x}, \qquad b_3 = \frac{\partial a_2}{\partial x}. \tag{1.129}$$

Integrating, we obtain

$$a_2 = \int_{x_0}^{x} b_3\,dx + f_2(y,z), \qquad a_3 = -\int_{x_0}^{x} b_2\,dx + f_3(y,z), \tag{1.130}$$

where $f_2$ and $f_3$ are arbitrary functions of y and z but are not functions of x.
These two equations can be checked by differentiating and recovering Eq.
1.129. Eq. 1.128a becomes

$$\frac{\partial a_3}{\partial y} - \frac{\partial a_2}{\partial z} = -\int_{x_0}^{x}\left(\frac{\partial b_2}{\partial y} + \frac{\partial b_3}{\partial z}\right)dx + \frac{\partial f_3}{\partial y} - \frac{\partial f_2}{\partial z} = \int_{x_0}^{x}\frac{\partial b_1}{\partial x}\,dx + \frac{\partial f_3}{\partial y} - \frac{\partial f_2}{\partial z}, \tag{1.131}$$

using $\nabla\cdot\mathbf{B} = 0$. Integrating with respect to x, we obtain

$$\frac{\partial a_3}{\partial y} - \frac{\partial a_2}{\partial z} = b_1(x,y,z) - b_1(x_0,y,z) + \frac{\partial f_3}{\partial y} - \frac{\partial f_2}{\partial z}. \tag{1.132}$$

Remembering that $f_3$ and $f_2$ are arbitrary functions of y and z, we choose

$$f_2 = 0, \qquad f_3 = \int_{y_0}^{y} b_1(x_0,y,z)\,dy, \tag{1.133}$$

so that the right-hand side of Eq. 1.132 reduces to $b_1(x,y,z)$ in agreement with
Eq. 1.128a. With $f_2$ and $f_3$ given by Eq. 1.133, we can construct A:

$$\mathbf{A} = \mathbf{j}\int_{x_0}^{x} b_3(x,y,z)\,dx + \mathbf{k}\left[\int_{y_0}^{y} b_1(x_0,y,z)\,dy - \int_{x_0}^{x} b_2(x,y,z)\,dx\right]. \tag{1.134}$$

This is not quite complete. We may add any constant since B is a derivative of
A. What is much more important, we may add any gradient of a scalar function
$\nabla\varphi$ without affecting B at all. Finally, the functions $f_2$ and $f_3$ are not unique.
Other choices could have been made. It will be seen in Section 1.15 that we may
still specify $\nabla\cdot\mathbf{A}$.

1 Clearly, this can be done at any one point. It is not at all obvious that this
assumption will hold at all points; that is, A will be two dimensional. The
justification for the assumption is that it works; Eq. 1.134 satisfies Eq. 1.127.

EXAMPLE 1.13.3 A Magnetic Vector Potential for a Constant Magnetic 
Field 


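To illustrate the construction of a magnetic vector potential, we take the
special but still important case of a constant magnetic induction

$$\mathbf{B} = \mathbf{k}B_z, \tag{1.135}$$

in which $B_z$ is a constant. Equation 1.128 becomes

$$\begin{aligned}
\frac{\partial a_3}{\partial y} - \frac{\partial a_2}{\partial z} &= 0, \\
\frac{\partial a_1}{\partial z} - \frac{\partial a_3}{\partial x} &= 0, \\
\frac{\partial a_2}{\partial x} - \frac{\partial a_1}{\partial y} &= B_z.
\end{aligned} \tag{1.136}$$

If we assume that $a_1 = 0$, as before, then by Eq. 1.134

$$\mathbf{A} = \mathbf{j}\int^{x} B_z\,dx = \mathbf{j}xB_z, \tag{1.137}$$

setting a constant of integration equal to zero. It can readily be seen that this A
satisfies Eq. 1.127.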

To show that the choice $a_1 = 0$ was not sacred or at least not required, let us
try setting $a_3 = 0$. From Eq. 1.136

$$\frac{\partial a_2}{\partial z} = 0, \tag{1.138a}$$

$$\frac{\partial a_1}{\partial z} = 0, \tag{1.138b}$$

$$\frac{\partial a_2}{\partial x} - \frac{\partial a_1}{\partial y} = B_z. \tag{1.138c}$$

We see $a_1$ and $a_2$ are independent of z or

$$a_1 = a_1(x,y), \qquad a_2 = a_2(x,y). \tag{1.139}$$

Equation 1.138c is satisfied if we take

$$a_2 = p\int^{x} B_z\,dx = pxB_z \tag{1.140}$$

and

$$a_1 = (p - 1)\int^{y} B_z\,dy = (p - 1)yB_z, \tag{1.141}$$

with p any constant. Then

$$\mathbf{A} = \mathbf{i}(p - 1)yB_z + \mathbf{j}pxB_z. \tag{1.142}$$

Again, Eqs. 1.127, 1.135, and 1.142 are seen to be consistent. Comparison of
Eqs. 1.137 and 1.142 shows immediately that A is not unique. The difference
between Eqs. 1.137 and 1.142 and the appearance of the parameter p in Eq.
1.142 may be accounted for by rewriting Eq. 1.142 as

$$\begin{aligned}
\mathbf{A} &= -\tfrac{1}{2}(\mathbf{i}y - \mathbf{j}x)B_z + (p - \tfrac{1}{2})(\mathbf{i}y + \mathbf{j}x)B_z \\
&= -\tfrac{1}{2}(\mathbf{i}y - \mathbf{j}x)B_z + (p - \tfrac{1}{2})B_z\nabla\varphi
\end{aligned} \tag{1.143}$$

with

$$\varphi = xy. \tag{1.144}$$

The first term in A corresponds to the usual form

$$\mathbf{A} = \tfrac{1}{2}(\mathbf{B}\times\mathbf{r}) \tag{1.145}$$

for B, a constant.
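The nonuniqueness of A in Example 1.13.3 can be seen numerically by taking the curl of Eq. 1.142 for several values of p. The sketch below is illustrative: the field strength `BZ = 2.0`, the evaluation point, and the helper names are assumptions made here, and the curl is taken by central differences.

```python
BZ = 2.0  # hypothetical value of the constant B_z


def curl(A, point, eps=1e-4):
    """Curl of the vector field A(x, y, z) at `point` by central differences."""
    def d(comp, axis):
        hi, lo = list(point), list(point)
        hi[axis] += eps
        lo[axis] -= eps
        return (A(*hi)[comp] - A(*lo)[comp]) / (2 * eps)

    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))


def make_A(p):
    # Eq. 1.142: A = i (p - 1) y B_z + j p x B_z
    return lambda x, y, z: ((p - 1) * y * BZ, p * x * BZ, 0.0)


point = (0.4, -1.3, 2.2)   # arbitrary evaluation point
curls = {p: curl(make_A(p), point) for p in (0.0, 0.5, 1.0, 3.0)}
for p, c in curls.items():
    print(p, c)
```

Every choice of p returns the same field, k B_z. The value p = 1 reproduces Eq. 1.137 and p = 1/2 gives the symmetric form of Eq. 1.145; the choices differ only by a gradient, as Eq. 1.143 shows.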

To summarize this discussion of the vector potential, when a vector B is 
solenoidal, a vector potential A exists such that B = V x A. A is undetermined 
to within an additive gradient. This corresponds to the arbitrary zero of poten- 
tial, a constant of integration for the scalar potential. 

In many problems the magnetic vector potential A will be obtained from the 
current distribution that produces the magnetic induction B. This means solving 
Poisson’s (vector) equation (see Exercise 1.14.4). 


EXERCISES 

1.13.1 If a force F is given by

$$\mathbf{F} = (x^2 + y^2 + z^2)^n(\mathbf{i}x + \mathbf{j}y + \mathbf{k}z),$$

find

(a) $\nabla\cdot\mathbf{F}$,

(b) $\nabla\times\mathbf{F}$,

(c) A scalar potential φ(x,y,z) so that $\mathbf{F} = -\nabla\varphi$.

(d) For what value of the exponent n does the scalar potential diverge at both
the origin and infinity?

ANS. (a) $(2n + 3)r^{2n}$ (b) 0 (c) $-\dfrac{1}{2n + 2}r^{2n+2}$, $n \neq -1$ (d) $n = -1$, $\varphi = -\ln r$.

1.13.2 A sphere of radius a is uniformly charged (throughout its volume). Construct
the electrostatic potential φ(r) for 0 ≤ r < ∞.

Hint. In Section 1.14 it is shown that the Coulomb force on a test charge at
$r = r_0$ depends only on the charge at distances less than $r_0$ and is independent
of the charge at distances greater than $r_0$. Note that this applies to a spherically
symmetric charge distribution.


1.13.3 The usual problem in classical mechanics is to calculate the motion of a particle
given the potential. For a uniform density ($\rho_0$), nonrotating massive sphere,
Gauss’s law of Section 1.14 leads to a gravitational force on a unit mass $m_0$
at a point $r_0$ produced by the attraction of the mass at $r \le r_0$. The mass at
$r > r_0$ contributes nothing to the force.

(a) Show that $\mathbf{F}/m_0 = -(4\pi G\rho_0/3)\mathbf{r}$, 0 ≤ r ≤ a, where a is the radius of the
sphere.

(b) Find the corresponding gravitational potential, 0 ≤ r ≤ a.

(c) Imagine a vertical hole running completely through the center of the earth
and out to the far side. Neglecting the rotation of the earth and assuming
a uniform density $\rho_0 = 5.5$ gm/cm³, calculate the nature of the motion of
a particle dropped into the hole. What is its period?

Note. $F \propto r$ is actually a very poor approximation. Because of varying density,
the approximation F = constant, along the outer half of a radial line and
$F \propto r$ along the inner half is a much closer approximation.


1 . 1 3.4 The origin of the cartesian coordinates is at the Earth’s center. The moon is 
on the z-axis, a fixed distance R away (center-to-center distance). The tidal 
force exerted by the moon on a particle at the earth’s surface (point x, y* z ) 
is given by 

F x = -GMmAr, F = F z = +GMm~. 

R 3 y R 3 R 3 


Find the potential that yields this tidal force. 
ANS. ~^~(z 2 - \x 2 - \y 2 ) 


In terms of the Legendre polynomials of Chapter 12 this becomes 


— GMm 


r 2 P 2 (cos 6). 


1 . 13.5 


A long straight wire carrying a current / produces a magnetic induction B with 
components 


p _ Rol ( zZ * 

2 n \x 2 + y 2 ’ x 2 + y 2 



Find a magnetic vector potential, A. 

ANS. A = — k(pQl/4n)\n(x 2 + y 2 ). 
(This solution is not unique.) 


1 . 13.6 If 



find a vector A such that V x A = B. One possible solution is 

A = i yz j xz 

r(x 2 4- y 2 ) r(x 2 + y 2 ) ' 
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1 .1 3.7 Show that the pair of equations 

A = i(B x r), 

B = V x A, 

is satisfied by any constant vector B (any orientation). 

1.13.8 Vector B is formed by the product of two gradients

$$\mathbf{B} = (\nabla u)\times(\nabla v),$$

where u and v are scalar functions.

(a) Show that B is solenoidal.

(b) Show that

$$\mathbf{A} = \tfrac{1}{2}(u\nabla v - v\nabla u)$$

is a vector potential for B in that

$$\mathbf{B} = \nabla\times\mathbf{A}.$$

1.13.9 The magnetic induction B is related to the magnetic vector potential A by 
$\mathbf{B} = \nabla\times\mathbf{A}$. By Stokes’s theorem

$$\int \mathbf{B}\cdot d\boldsymbol{\sigma} = \oint \mathbf{A}\cdot d\mathbf{r}.$$

Show that each side of this equation is invariant under the gauge transformation,
$\mathbf{A} \to \mathbf{A} + \nabla\psi$.

Note. Take the function $\psi$ to be single-valued. The complete gauge transformation
is considered in Exercise 3.7.4.


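1.13.10 With E the electric field and A the magnetic vector potential, show that
$[\mathbf{E} + \partial\mathbf{A}/\partial t]$ is irrotational and that therefore we may write

$$\mathbf{E} = -\nabla\varphi - \frac{\partial\mathbf{A}}{\partial t}.$$
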

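1.13.11 The total force on a charge q moving with velocity v is

$$\mathbf{F} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B}).$$

Using the scalar and vector potentials, show that

$$\mathbf{F} = q\left[-\nabla\varphi - \frac{d\mathbf{A}}{dt} + \nabla(\mathbf{A}\cdot\mathbf{v})\right].$$

Note that we now have a total time derivative of A in place of the partial derivative
of Ex. 1.13.10.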


1.14 GAUSS'S LAW, POISSON'S EQUATION 

Gauss's Law 

Consider a point electric charge q at the origin of our coordinate system. 
This produces an electric field E given by$^1$

$$\mathbf{E} = \frac{q\,\mathbf{r}_0}{4\pi\varepsilon_0 r^2}. \tag{1.146}$$

We now derive Gauss’s law, which states that the surface integral

$$\oint_S \mathbf{E}\cdot d\boldsymbol{\sigma} = \begin{cases} q/\varepsilon_0 \\ 0 \end{cases} \tag{1.147}$$

is $q/\varepsilon_0$ if the closed surface S includes the origin (where q is located) and zero if the
surface does not include the origin (Fig. 1.29). The surface S is any closed surface;
it need not be spherical.

1 The electric field E is defined as the force per unit charge on a small stationary
test charge $q_t$: $\mathbf{E} = \mathbf{F}/q_t$. From Coulomb’s law the force on $q_t$ due to q is
$\mathbf{F} = (q q_t/4\pi\varepsilon_0)(\mathbf{r}_0/r^2)$. When we divide by $q_t$, Eq. 1.146 follows.

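Using Gauss’s theorem, Eq. 1.94 (and neglecting the $q/4\pi\varepsilon_0$), we obtain

$$\oint_S \frac{\mathbf{r}_0\cdot d\boldsymbol{\sigma}}{r^2} = \int_V \nabla\cdot\left(\frac{\mathbf{r}_0}{r^2}\right) d\tau = 0 \tag{1.148}$$

by Example 1.7.2, provided the surface S does not include the origin, where the
integrands are not defined. This proves the second part of Gauss’s law.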
FIG. 1.30 Exclusion of the origin


The first part, in which the surface S must include the origin, may be handled
by surrounding the origin with a small sphere S' of radius $\delta$ (Fig. 1.30). So that
there will be no question what is inside and what is outside, imagine the volume
outside the outer surface S and the volume inside surface S' ($r < \delta$) connected
by a small hole. This joins surfaces S and S', combining them into one single
simply connected closed surface. Because the radius of the imaginary hole may
be made vanishingly small, there is no additional contribution to the surface
integral. The inner surface is deliberately chosen to be spherical so that we will
be able to integrate over it. Gauss’s theorem now applies to the volume between
S and S' without any difficulty. We have


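$$\oint_S \frac{\mathbf{r}_0\cdot d\boldsymbol{\sigma}}{r^2} + \oint_{S'} \frac{\mathbf{r}_0\cdot d\boldsymbol{\sigma}'}{\delta^2} = 0. \tag{1.149}$$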


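We may evaluate the second integral, for $d\boldsymbol{\sigma}' = -\mathbf{r}_0\,\delta^2\,d\Omega$, in which $d\Omega$ is an
element of solid angle. The minus sign appears because we agreed in Section
1.10 to have the positive normal $\mathbf{r}'_0$ outward from the volume. In this case the
outward $\mathbf{r}'_0$ is in the negative radial direction, $\mathbf{r}'_0 = -\mathbf{r}_0$. By integrating over all
angles, we have

$$\oint_{S'} \frac{\mathbf{r}_0\cdot d\boldsymbol{\sigma}'}{\delta^2} = -\oint_{S'} \frac{\mathbf{r}_0\cdot\mathbf{r}_0\,\delta^2\,d\Omega}{\delta^2} = -4\pi, \tag{1.150}$$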


independent of the radius $\delta$. With the constants from Eq. 1.146, this results in

$$\oint_S \mathbf{E}\cdot d\boldsymbol{\sigma} = \frac{q}{4\pi\varepsilon_0}\,4\pi = \frac{q}{\varepsilon_0}, \tag{1.151}$$
completing the proof of Gauss’s law. Notice carefully that although the surface
S may be spherical, it need not be spherical.
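The claim that S need not be spherical can be checked numerically. The following is a minimal sketch, not part of the text: the function name `flux_through_cube`, the cube positions, and the midpoint-rule resolution are illustrative assumptions. It estimates the outward flux of $\mathbf{r}_0/r^2$ through a cube, which should be close to $4\pi$ when the cube encloses the origin and close to zero when it does not.

```python
import math

def flux_through_cube(center, half_side, n=60):
    """Midpoint-rule estimate of the outward flux of r_0 / r^2 through a cube."""
    cx, cy, cz = center
    h = 2.0 * half_side / n              # side length of one surface patch
    total = 0.0
    for axis in range(3):                # faces normal to x, y, z in turn
        for sign in (-1.0, 1.0):         # the two opposite faces
            for i in range(n):
                for j in range(n):
                    u = -half_side + (i + 0.5) * h
                    v = -half_side + (j + 0.5) * h
                    p = [0.0, 0.0, 0.0]
                    p[axis] = sign * half_side
                    p[(axis + 1) % 3] = u
                    p[(axis + 2) % 3] = v
                    point = (p[0] + cx, p[1] + cy, p[2] + cz)
                    r = math.sqrt(sum(c * c for c in point))
                    # (r_0 / r^2) . n dA with outward normal n = sign * e_axis
                    total += sign * point[axis] / r**3 * h * h
    return total

# Cube enclosing the origin (deliberately off-center) and cube excluding it.
enclosing = flux_through_cube((0.3, -0.2, 0.1), 1.0)
excluding = flux_through_cube((3.0, 0.0, 0.0), 1.0)
print(enclosing, 4.0 * math.pi, excluding)
```

Multiplying the first result by $q/4\pi\varepsilon_0$ gives the $q/\varepsilon_0$ of Eq. 1.151.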

Going just a bit further, we consider a distributed charge so that

$$q = \int_V \rho\, d\tau. \tag{1.152}$$


Equation 1.151 still applies, with q now interpreted as the total distributed 
charge enclosed by surface S. 

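$$\oint_S \mathbf{E}\cdot d\boldsymbol{\sigma} = \int_V \frac{\rho}{\varepsilon_0}\, d\tau. \tag{1.153}$$

Using Gauss’s theorem, we have

$$\int_V \nabla\cdot\mathbf{E}\, d\tau = \int_V \frac{\rho}{\varepsilon_0}\, d\tau. \tag{1.154}$$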

Since our volume is completely arbitrary, the integrands must be equal or 



$$\nabla\cdot\mathbf{E} = \frac{\rho}{\varepsilon_0}, \tag{1.155}$$

one of Maxwell’s equations. If we reverse the argument, Gauss’s law follows 
immediately from Maxwell’s equation. 


Poisson's Equation 
Replacing E by $-\nabla\varphi$, Eq. 1.155 becomes

$$\nabla\cdot\nabla\varphi = -\frac{\rho}{\varepsilon_0}, \tag{1.156}$$

which is Poisson’s equation. For the condition $\rho = 0$ this reduces to an even
more famous equation,

$$\nabla\cdot\nabla\varphi = 0, \tag{1.157}$$

Laplace’s equation. We encounter Laplace’s equation frequently in discussing 
various coordinate systems (Chapter 2) and the special functions of mathe- 
matical physics which appear as its solutions. Poisson’s equation will be in- 
valuable in developing the theory of Green’s functions (Sections 8.7 and 16.5). 
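A finite-difference check makes the two equations concrete. This is a sketch under stated assumptions, not the text's method: the helper `laplacian`, the unit choice $\varepsilon_0 = 1$, the density value, and the test point are all illustrative. The potential $1/r$ should satisfy Laplace's equation away from the origin, and $\varphi = -\rho r^2/6\varepsilon_0$ should satisfy Poisson's equation for a uniform density $\rho$.

```python
import math

def laplacian(phi, point, h=1e-3):
    """Seven-point central-difference estimate of div grad phi at `point`."""
    x, y, z = point
    center = phi(x, y, z)
    total = 0.0
    for dx, dy, dz in ((h, 0.0, 0.0), (0.0, h, 0.0), (0.0, 0.0, h)):
        total += (phi(x + dx, y + dy, z + dz) - 2.0 * center
                  + phi(x - dx, y - dy, z - dz))
    return total / (h * h)

EPS0 = 1.0   # assumption: units chosen so that epsilon_0 = 1
RHO = 2.5    # assumption: arbitrary uniform charge density

def coulomb(x, y, z):
    """Point-charge potential shape 1/r (constants dropped)."""
    return 1.0 / math.sqrt(x * x + y * y + z * z)

def uniform_source(x, y, z):
    """phi = -rho r^2 / (6 eps0), a solution of Eq. 1.156 for constant rho."""
    return -RHO * (x * x + y * y + z * z) / (6.0 * EPS0)

point = (0.5, 0.4, 0.3)                            # away from the origin
laplace_residual = laplacian(coulomb, point)       # expect about 0
poisson_value = laplacian(uniform_source, point)   # expect about -RHO / EPS0
print(laplace_residual, poisson_value)
```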

From direct comparison of the Coulomb electrostatic force law and Newton’s
law of universal gravitation

$$\mathbf{F}_E = \frac{1}{4\pi\varepsilon_0}\frac{q_1 q_2}{r^2}\,\mathbf{r}_0, \qquad \mathbf{F}_G = -G\,\frac{m_1 m_2}{r^2}\,\mathbf{r}_0,$$

all of the potential theory of this section applies equally well to gravitational
potentials. For example, the gravitational Poisson equation is

$$\nabla\cdot\nabla\varphi = +4\pi G\rho \tag{1.156a}$$

with $\rho$ now a mass density.
EXERCISES


1.14.1 Develop Gauss’s law for the two-dimensional case in which

$$\varphi = -q\,\frac{\ln\rho}{2\pi\varepsilon_0}, \qquad \mathbf{E} = -\nabla\varphi = q\,\frac{\boldsymbol{\rho}_0}{2\pi\varepsilon_0\,\rho}.$$

Here q is the charge at the origin or the line charge per unit length if the two-dimensional
system is a unit thickness slice of a three-dimensional (circular
cylindrical) system. The variable $\rho$ is measured radially outward from the line
charge. $\boldsymbol{\rho}_0$ is the corresponding unit vector (see Section 2.4).


1.14.2 (a) Show that Gauss’s law follows from Maxwell’s equation

$$\nabla\cdot\mathbf{E} = \frac{\rho}{\varepsilon_0}.$$

Here $\rho$ is the usual charge density.

(b) Assuming that the electric field of a point charge q is spherically symmetric,
show that Gauss’s law implies the Coulomb inverse square expression

$$\mathbf{E} = \frac{q\,\mathbf{r}_0}{4\pi\varepsilon_0 r^2}.$$


1.14.3 Show that the value of the electrostatic potential $\varphi$ at any point P is equal to the
average of the potential over any spherical surface centered on P. There are no
electric charges on or within the sphere.

Hint. Use Green’s theorem, Eq. 1.97, with $u^{-1} = r$, the distance from P, and $v = \varphi$.
Also note Eq. 1.173 in Section 1.15.

1.14.4 Using Maxwell’s equations, show that for a system (steady current) the magnetic
vector potential A satisfies a vector Poisson equation,

$$\nabla^2\mathbf{A} = -\mu\mathbf{J},$$

provided we require $\nabla\cdot\mathbf{A} = 0$.


1.15 HELMHOLTZ'S THEOREM 


In Section 1.13 it was emphasized that the choice of a magnetic vector 
potential A was not unique. The divergence of A was still undetermined. In this 
section two theorems about the divergence and curl of a vector are developed. 
The first theorem is as follows. 

A vector is uniquely specified by giving its divergence and its curl within a 
region and its normal component over the boundary. 

Let us take

$$\nabla\cdot\mathbf{V}_1 = s, \qquad \nabla\times\mathbf{V}_1 = \mathbf{c}, \tag{1.158}$$

where $s$ may be interpreted as a source (charge) density and $\mathbf{c}$ as a circulation
(current) density. Assuming also that the normal component $V_{1n}$ on the boundary
is given, we want to show that $\mathbf{V}_1$ is unique. We do this by assuming the
existence of a second vector $\mathbf{V}_2$, which satisfies Eq. 1.158 and has the same
normal component over the boundary, and then showing that $\mathbf{V}_1 - \mathbf{V}_2 = 0$.
Let

$$\mathbf{W} = \mathbf{V}_1 - \mathbf{V}_2.$$

Then

$$\nabla\cdot\mathbf{W} = 0 \tag{1.159}$$

and

$$\nabla\times\mathbf{W} = 0. \tag{1.160}$$

Since W is irrotational we may write (by Section 1.13)

$$\mathbf{W} = -\nabla\varphi. \tag{1.161}$$

Substituting this into Eq. 1.159, we obtain

$$\nabla\cdot\nabla\varphi = 0, \tag{1.162}$$

Laplace’s equation.

Now we draw upon Green’s theorem in the form given in Eq. 1.98, letting u
and v each equal $\varphi$. Since

$$W_n = V_{1n} - V_{2n} = 0 \tag{1.163}$$

on the boundary, Green’s theorem reduces to

$$\int_V (\nabla\varphi)\cdot(\nabla\varphi)\, d\tau = \int_V \mathbf{W}\cdot\mathbf{W}\, d\tau = 0. \tag{1.164}$$

The quantity $\mathbf{W}\cdot\mathbf{W} = W^2$ is nonnegative and so we must have

$$\mathbf{W} = \mathbf{V}_1 - \mathbf{V}_2 = 0 \tag{1.165}$$

everywhere. Thus $\mathbf{V}_1$ is unique, proving the theorem.

For our magnetic vector potential A the relation B = V x A specifies the 
curl of A. Often for convenience we set V • A = 0 (compare Exercise 1.14.4). 
Then (with boundary conditions) A is fixed. 

This theorem may be written as a uniqueness theorem for solutions of 
Laplace’s equation, Exercise 1.15.1. In this form, this uniqueness theorem is of 
great importance in solving electrostatic and other Laplace equation boundary 
value problems. If we can find a solution of Laplace’s equation that satisfies 
the necessary boundary conditions, then our solution is the complete solution. 
Such boundary value problems are taken up in Sections 12.3 and 12.5. 


Helmholtz's Theorem 

The second theorem we shall prove is Helmholtz’s theorem. 

A vector V satisfying Eq. 1.158 with both source and circulation densities
vanishing at infinity may be written as the sum of two parts, one of which is
irrotational, the other solenoidal.

Helmholtz’s theorem will clearly be satisfied if we may write V as

$$\mathbf{V} = -\nabla\varphi + \nabla\times\mathbf{A}, \tag{1.166}$$

$-\nabla\varphi$ being irrotational and $\nabla\times\mathbf{A}$ being solenoidal. We proceed to justify
Eq. 1.166.

V is a known vector. Taking the divergence and curl

$$\nabla\cdot\mathbf{V} = s(\mathbf{r}) \tag{1.166a}$$

$$\nabla\times\mathbf{V} = \mathbf{c}(\mathbf{r}), \tag{1.166b}$$

with $s(\mathbf{r})$ and $\mathbf{c}(\mathbf{r})$ now known functions of position. From these two functions
we construct a scalar potential $\varphi(\mathbf{r}_1)$,

$$\varphi(\mathbf{r}_1) = \frac{1}{4\pi}\int \frac{s(\mathbf{r}_2)}{r_{12}}\, d\tau_2, \tag{1.167a}$$

and a vector potential $\mathbf{A}(\mathbf{r}_1)$,

$$\mathbf{A}(\mathbf{r}_1) = \frac{1}{4\pi}\int \frac{\mathbf{c}(\mathbf{r}_2)}{r_{12}}\, d\tau_2. \tag{1.167b}$$

Here the argument $\mathbf{r}_1$ indicates $(x_1, y_1, z_1)$, the field point; $\mathbf{r}_2$, the coordinates
of the source point $(x_2, y_2, z_2)$, whereas

$$r_{12} = \left[(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2\right]^{1/2}. \tag{1.168}$$

When a direction is associated with $r_{12}$, the positive direction is taken to be
away from the source toward the field point. Vectorially, $\mathbf{r}_{12} = \mathbf{r}_1 - \mathbf{r}_2$, as
shown in Fig. 1.31. Of course, $s$ and $\mathbf{c}$ must vanish sufficiently rapidly at large
distance so that the integrals exist. The actual expansion and evaluation of
integrals such as Eqs. 1.167a and b is treated in Section 12.1.

From the uniqueness theorem at the beginning of this section, V is uniquely
specified by its divergence, $s$, and curl, $\mathbf{c}$ (and boundary conditions). Returning
to Eq. 1.166, we have

$$\nabla\cdot\mathbf{V} = -\nabla\cdot\nabla\varphi, \tag{1.169a}$$

the divergence of the curl vanishing, and

$$\nabla\times\mathbf{V} = \nabla\times\nabla\times\mathbf{A}, \tag{1.169b}$$

the curl of the gradient vanishing. If we can show that

$$-\nabla\cdot\nabla\varphi(\mathbf{r}_1) = s(\mathbf{r}_1) \tag{1.169c}$$

and

$$\nabla\times\nabla\times\mathbf{A}(\mathbf{r}_1) = \mathbf{c}(\mathbf{r}_1), \tag{1.169d}$$

then V as given in Eq. 1.166 will have the proper divergence and curl. Our
description will be internally consistent and Eq. 1.166 justified.$^1$
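First, we consider the divergence of V:

$$\nabla\cdot\mathbf{V} = -\nabla\cdot\nabla\varphi = -\frac{1}{4\pi}\,\nabla\cdot\nabla\int \frac{s(\mathbf{r}_2)}{r_{12}}\, d\tau_2. \tag{1.170}$$

The Laplacian operator, $\nabla\cdot\nabla$ or $\nabla^2$, operates on the field coordinates $(x_1, y_1, z_1)$
and so commutes with the integration with respect to $(x_2, y_2, z_2)$. We have

$$\nabla\cdot\mathbf{V} = -\frac{1}{4\pi}\int s(\mathbf{r}_2)\,\nabla_1^2\left(\frac{1}{r_{12}}\right) d\tau_2. \tag{1.171}$$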

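From Example 1.6.1 and the development of Gauss’s law in Section 1.14,

$$\int \nabla\cdot\nabla\left(\frac{1}{r}\right) d\tau = \begin{cases} -4\pi \\ 0 \end{cases} \tag{1.172}$$

depending on whether the integration included the origin $r = 0$. This result
may be conveniently expressed by introducing the Dirac delta function, $\delta(\mathbf{r})$,$^2$

$$\nabla^2\left(\frac{1}{r}\right) = -4\pi\,\delta(\mathbf{r}). \tag{1.173}$$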

This Dirac delta function is defined by its assigned properties

$$\delta(\mathbf{r}) = 0, \qquad \mathbf{r} \ne 0, \tag{1.174a}$$

$$\int f(\mathbf{r})\,\delta(\mathbf{r})\, d\tau = f(0), \tag{1.174b}$$

where $f(\mathbf{r})$ is any well-behaved function and the volume of integration includes
the origin. As a special case of Eq. 1.174b,

$$\int \delta(\mathbf{r})\, d\tau = 1. \tag{1.175}$$

1 Alternatively, we could solve Eq. 1.169c, Poisson’s equation, and compare
the solution with the constructed potential, Eq. 1.167a. The solution of
Poisson’s equation is developed in Section 8.7.

2 Compare Section 8.7 for a more extended treatment of the Dirac delta
function.

The quantity $\delta(\mathbf{r})$ is really not a function at all, since it is undefined (infinite)
at $\mathbf{r} = 0$. However, the crucial property, Eq. 1.174b, can be developed rigorously
as the limit of a sequence of functions, a distribution. This development appears
in Section 8.7. Here we proceed to use the delta function in terms of its defining
properties.
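The limiting behavior can be illustrated with a one-dimensional analog. The sketch below is an assumption-laden illustration rather than the development of Section 8.7: the Gaussian sequence $\delta_n(x) = (n/\sqrt{\pi})\,e^{-n^2x^2}$, the test function $\cos x$, and the helper names `delta_n` and `sift` are choices made here. As $n$ grows, the integral of $f(x)\,\delta_n(x)$ should approach $f(0)$, the one-dimensional counterpart of Eq. 1.174b.

```python
import math

def delta_n(x, n):
    """Gaussian member of a delta sequence (an illustrative choice)."""
    return n / math.sqrt(math.pi) * math.exp(-(n * x) ** 2)

def sift(f, n, half_width=1.0, steps=20000):
    """Midpoint-rule estimate of the integral of f(x) * delta_n(x) on [-w, w]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * h
        total += f(x) * delta_n(x, n) * h
    return total

# f(x) = cos x has f(0) = 1, so the values should climb toward 1 as n increases.
values = [sift(math.cos, n) for n in (1, 5, 50)]
normalization = sift(lambda x: 1.0, 50)   # analog of Eq. 1.175
print(values, normalization)
```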

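We must make two minor modifications in Eq. 1.173 before applying it.
First, our source is at $\mathbf{r}_2$, not at the origin. This means that the $4\pi$ in Gauss’s
law appears if and only if the surface includes the point $\mathbf{r} = \mathbf{r}_2$. To show this,
we rewrite Eq. 1.173:

$$\nabla^2\left(\frac{1}{r_{12}}\right) = -4\pi\,\delta(\mathbf{r}_1 - \mathbf{r}_2). \tag{1.176}$$

This shift of the source to $\mathbf{r}_2$ may be incorporated in the defining equations
(1.174) as

$$\delta(\mathbf{r}_1 - \mathbf{r}_2) = 0, \qquad \mathbf{r}_1 \ne \mathbf{r}_2, \tag{1.177a}$$

$$\int f(\mathbf{r}_1)\,\delta(\mathbf{r}_1 - \mathbf{r}_2)\, d\tau_1 = f(\mathbf{r}_2). \tag{1.177b}$$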

Second, noting that differentiating $r_{12}^{-1}$ twice with respect to $x_2, y_2, z_2$ is the
same as differentiating twice with respect to $x_1, y_1, z_1$, we have

$$\nabla_1^2\left(\frac{1}{r_{12}}\right) = \nabla_2^2\left(\frac{1}{r_{12}}\right) = -4\pi\,\delta(\mathbf{r}_2 - \mathbf{r}_1). \tag{1.178}$$

We could equally well have noted that from its defining properties

$$\delta(\mathbf{r}_1 - \mathbf{r}_2) = \delta(\mathbf{r}_2 - \mathbf{r}_1). \tag{1.179}$$

Rewriting Eq. 1.171 and using the Dirac delta function, Eq. 1.178, we may
integrate to obtain

$$\begin{aligned}
\nabla\cdot\mathbf{V} &= -\frac{1}{4\pi}\int s(\mathbf{r}_2)\,\nabla_2^2\left(\frac{1}{r_{12}}\right) d\tau_2 \\
&= -\frac{1}{4\pi}\int s(\mathbf{r}_2)\,(-4\pi)\,\delta(\mathbf{r}_2 - \mathbf{r}_1)\, d\tau_2 \\
&= s(\mathbf{r}_1).
\end{aligned} \tag{1.180}$$

The final step follows from Eq. 1.177b with the subscripts 1 and 2 exchanged.
Our result, Eq. 1.180, shows that the assumed form of V and of the scalar
potential $\varphi$ are in agreement with the given divergence (Eq. 1.166a).

To complete the proof of Helmholtz’s theorem, we need to show that our
assumptions are consistent with Eq. 1.166b, that is, the curl of V is equal to
$\mathbf{c}(\mathbf{r}_1)$. From Eq. 1.166
$$\nabla\times\mathbf{V} = \nabla\times\nabla\times\mathbf{A} = \nabla\nabla\cdot\mathbf{A} - \nabla^2\mathbf{A}. \tag{1.181}$$

The first term, $\nabla\nabla\cdot\mathbf{A}$, leads to

$$4\pi\,\nabla\nabla\cdot\mathbf{A} = \int \mathbf{c}(\mathbf{r}_2)\cdot\nabla_1\nabla_1\left(\frac{1}{r_{12}}\right) d\tau_2 \tag{1.182}$$

by Eq. 1.167b. Again replacing the second derivatives with respect to $x_1, y_1, z_1$
by second derivatives with respect to $x_2, y_2, z_2$, we integrate each component$^3$
of Eq. 1.182 by parts:


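$$\begin{aligned}
4\pi\,\nabla\nabla\cdot\mathbf{A}\big|_x &= \int \mathbf{c}(\mathbf{r}_2)\cdot\nabla_2\,\frac{\partial}{\partial x_2}\left(\frac{1}{r_{12}}\right) d\tau_2 \\
&= \int \nabla_2\cdot\left[\mathbf{c}(\mathbf{r}_2)\,\frac{\partial}{\partial x_2}\left(\frac{1}{r_{12}}\right)\right] d\tau_2 - \int \left[\nabla_2\cdot\mathbf{c}(\mathbf{r}_2)\right]\frac{\partial}{\partial x_2}\left(\frac{1}{r_{12}}\right) d\tau_2.
\end{aligned} \tag{1.183}$$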

The second integral vanishes because the circulation density $\mathbf{c}$ is solenoidal.$^4$
The first integral may be transformed to a surface integral by Gauss’s theorem.
If $\mathbf{c}$ is bounded in space or vanishes faster than $1/r$ for large $r$, so that the integral
in Eq. 1.167b exists, then by choosing a sufficiently large surface the first integral
on the right-hand side of Eq. 1.183 also vanishes.

With $\nabla\nabla\cdot\mathbf{A} = 0$, Eq. 1.181 now reduces to

$$\nabla\times\mathbf{V} = -\nabla^2\mathbf{A} = -\frac{1}{4\pi}\int \mathbf{c}(\mathbf{r}_2)\,\nabla_1^2\left(\frac{1}{r_{12}}\right) d\tau_2. \tag{1.184}$$


This is exactly like Eq. 1.171 except that the scalar $s(\mathbf{r}_2)$ is replaced by the vector
circulation density $\mathbf{c}(\mathbf{r}_2)$. Introducing the Dirac delta function, as before, as a
convenient way of carrying out the integration, we find that Eq. 1.184 reduces
to Eq. 1.158. We see that our assumed form of V, given by Eq. 1.166, and of
the vector potential A, given by Eq. 1.167b, are in agreement with Eq. 1.158
specifying the curl of V.

This completes the proof of Helmholtz’s theorem, showing that a vector
may be resolved into irrotational and solenoidal parts. Applied to the electromagnetic
field, we have resolved our field vector V into an irrotational electric
field E, derived from a scalar potential $\varphi$, and a solenoidal magnetic induction
field B, derived from a vector potential A. The source density $s(\mathbf{r})$ may be
interpreted as an electric charge density (divided by electric permittivity $\varepsilon$),
whereas the circulation density $\mathbf{c}(\mathbf{r})$ becomes electric current density (times
magnetic permeability $\mu$).


EXERCISES 

1.15.1 Implicit in this section is a proof that a function $\psi(\mathbf{r})$ is uniquely specified by
requiring it to (1) satisfy Laplace’s equation and (2) satisfy a complete set of 
boundary conditions. Develop this proof explicitly. 


3 This avoids creating the tensor $\mathbf{c}(\mathbf{r}_2)\nabla_2$.

4 Remember $\mathbf{c} = \nabla\times\mathbf{V}$ is known.
1.15.2 (a) Assuming that P is a solution of the vector Poisson equation, $\nabla_1^2\mathbf{P}(\mathbf{r}_1) = -\mathbf{V}(\mathbf{r}_1)$,
develop an alternate proof of Helmholtz’s theorem, showing that
V may be written as

$$\mathbf{V} = -\nabla\varphi + \nabla\times\mathbf{A},$$

where

$$\mathbf{A} = \nabla\times\mathbf{P},$$

and

$$\varphi = \nabla\cdot\mathbf{P}.$$

(b) Solving the vector Poisson equation, we find

$$\mathbf{P}(\mathbf{r}_1) = \frac{1}{4\pi}\int_V \frac{\mathbf{V}(\mathbf{r}_2)}{r_{12}}\, d\tau_2.$$

Show that this solution substituted into $\varphi$ and A of part (a) leads to the
expressions given for $\varphi$ and A in Section 1.15.
2 COORDINATE
SYSTEMS


In Chapter 1 we restricted ourselves almost completely to cartesian coor- 
dinate systems. A cartesian coordinate system offers the unique advantage that 
all three unit vectors, i, j, and k, are constant in direction as well as in magnitude. 
We did introduce the radial distance r but even this was treated as a function 
of x, y, and z. Unfortunately, not all physical problems are well adapted to 
solution in cartesian coordinates. For instance, if we have a central force prob- 
lem, F = r 0 F(r), such as gravitational or electrostatic force, cartesian coor- 
dinates may be unusually inappropriate. Such a problem literally screams for 
the use of a coordinate system in which the radial distance is taken to be one 
of the coordinates, that is, spherical polar coordinates. 

The point is that the coordinate system should be chosen to fit the problem, 
to exploit any constraint or symmetry present in it. Then, hopefully, it will be 
more readily soluble than if we had forced it into a cartesian framework. Quite 
often “more readily soluble” will mean that we have a partial differential equa- 
tion that can be split into separate ordinary differential equations, often in 
“standard form” in the new coordinate system. This technique, the separation 
of variables, is discussed in Section 2.6. 

We are primarily interested in coordinates in which the equation

$$\nabla^2\psi + k^2\psi = 0 \tag{2.1}$$

is separable. Equation 2.1 is much more general than it may appear. If

$k^2 = 0$, Eq. 2.1 -> Laplace’s equation,

$k^2 = (+)$ constant, Helmholtz’s equation,

$k^2 = (-)$ constant, Diffusion equation (space part),

$k^2$ = constant $\times$ kinetic energy, Schrödinger wave equation.

It has been shown [L. P. Eisenhart, Phys. Rev. 45, 427 (1934)] that there are 11
coordinate systems in which Eq. 2.1 is separable, all of which can be considered
particular cases of the confocal ellipsoidal system.

Naturally, there is a price that must be paid for the use of a noncartesian 
coordinate system. We have not yet written expressions for gradient, divergence, 
or curl in any of the noncartesian coordinate systems. Such expressions are 
developed in very general form in Section 2.2. First, we must develop a system 
of curvilinear coordinates, a general system that may be specialized to any of 
the particular systems of interest. We shall specialize to circular cylindrical 
coordinates in Section 2.4 and to spherical polar coordinates in Section 2.5. 
2.1 CURVILINEAR COORDINATES

In cartesian coordinates we deal with three mutually perpendicular families 
of planes : x = constant, y = constant, and z = constant. Imagine that we super- 
impose on this system three other families of surfaces. The surfaces of any one 
family need not be parallel to each other and they need not be planes. If this is 
difficult to visualize, the figure of a specific coordinate system such as Fig. 2.3 
may be helpful. The three new families of surfaces need not be mutually perpen- 
dicular, but for simplicity we quickly impose this condition (Eq. 2.7). We may 
describe any point (x, y, z) as the intersection of three planes in cartesian coordinates
or as the intersection of the three surfaces that form our new, curvilinear
coordinates. Describing the curvilinear coordinate surfaces by $q_1$ =
constant, $q_2$ = constant, $q_3$ = constant, we may identify our point by $(q_1, q_2, q_3)$
as well as by (x, y, z). This means that in principle we may write

General curvilinear coordinates        Circular cylindrical coordinates
$q_1, q_2, q_3$                          $\rho, \varphi, z$

$x = x(q_1, q_2, q_3)$                   $x = \rho\cos\varphi$
$y = y(q_1, q_2, q_3)$                   $y = \rho\sin\varphi$                  (2.2)
$z = z(q_1, q_2, q_3)$                   $z = z$

specifying x, y, z in terms of the q’s and the inverse relations,

$q_1 = q_1(x, y, z)$                     $\rho = (x^2 + y^2)^{1/2}$
$q_2 = q_2(x, y, z)$                     $\varphi = \arctan(y/x)$               (2.3)
$q_3 = q_3(x, y, z)$                     $z = z$

As a specific illustration of the general, abstract $q_1, q_2, q_3$, the transformation
equations for circular cylindrical coordinates (Section 2.4) are included in Eqs.
2.2 and 2.3. With each family of surfaces $q_i$ = constant, we can associate a unit
vector $\mathbf{e}_i$ normal to the surface $q_i$ = constant and in the direction of increasing
$q_i$. Then a vector V may be written

$$\mathbf{V} = \mathbf{e}_1 V_1 + \mathbf{e}_2 V_2 + \mathbf{e}_3 V_3.$$

Differentiation of x in Eq. 2.2 leads to

$$dx = \frac{\partial x}{\partial q_1}\,dq_1 + \frac{\partial x}{\partial q_2}\,dq_2 + \frac{\partial x}{\partial q_3}\,dq_3, \tag{2.4}$$

and similarly for differentiation of y and z. From the Pythagorean theorem in
cartesian coordinates the square of the distance between two neighboring
points is

$$ds^2 = dx^2 + dy^2 + dz^2. \tag{2.4a}$$

We assume that in our curvilinear coordinate space the square of the distance
element can be written as a general quadratic form:

$$\begin{aligned}
ds^2 &= g_{11}\,dq_1^2 + g_{12}\,dq_1\,dq_2 + g_{13}\,dq_1\,dq_3 \\
&\quad + g_{21}\,dq_2\,dq_1 + g_{22}\,dq_2^2 + g_{23}\,dq_2\,dq_3 \\
&\quad + g_{31}\,dq_3\,dq_1 + g_{32}\,dq_3\,dq_2 + g_{33}\,dq_3^2 \\
&= \sum_{ij} g_{ij}\,dq_i\,dq_j.
\end{aligned} \tag{2.5}$$

Spaces for which Eq. 2.5 is a legitimate expression are called metric or Riemannian.
Substituting Eq. 2.4 (squared) and the corresponding results for $dy^2$ and
$dz^2$ into Eq. 2.4a and equating coefficients of $dq_i\,dq_j$,$^1$ we find

$$g_{ij} = \frac{\partial x}{\partial q_i}\frac{\partial x}{\partial q_j} + \frac{\partial y}{\partial q_i}\frac{\partial y}{\partial q_j} + \frac{\partial z}{\partial q_i}\frac{\partial z}{\partial q_j}. \tag{2.6}$$


These coefficients $g_{ij}$, which we now proceed to investigate, may be viewed as
specifying the nature of the coordinate system $(q_1, q_2, q_3)$. Collectively these
coefficients are referred to as the metric and in Section 3.3 will be shown to be
a second-rank tensor.$^2$ In general relativity the metric components are determined
by the properties of matter. Geometry is merged with physics.

At this point we limit ourselves to orthogonal (mutually perpendicular surfaces)
coordinate systems, which means (see Exercise 2.1.1)$^3$

$$g_{ij} = 0, \qquad i \ne j. \tag{2.7}$$

(Nonorthogonal coordinate systems are considered in some detail in Sections
3.8 and 3.9 in the framework of tensor analysis and in Section 4.4 by using
matrix analysis.) Now, to simplify the notation, we write $g_{ii} = h_i^2$ so that

$$ds^2 = (h_1\,dq_1)^2 + (h_2\,dq_2)^2 + (h_3\,dq_3)^2. \tag{2.8}$$


The specific coordinate systems are described in subsequent sections by specifying
these scale factors $h_1$, $h_2$, and $h_3$. Conversely, the scale factors may be conveniently
identified by the relation

$$ds_i = h_i\,dq_i \tag{2.9}$$

for any given $dq_i$, holding the other q’s constant. Note that the three curvilinear
coordinates $q_1, q_2, q_3$ need not be lengths. The scale factors $h_i$ may depend on
the q’s and they may have dimensions. The product $h_i\,dq_i$ must have dimensions
of length. The differential distance vector $d\mathbf{r}$ may be written

$$d\mathbf{r} = h_1\,dq_1\,\mathbf{e}_1 + h_2\,dq_2\,\mathbf{e}_2 + h_3\,dq_3\,\mathbf{e}_3 = \sum_i h_i\,dq_i\,\mathbf{e}_i.$$
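Equations 2.6 to 2.8 can be tried out numerically on the circular cylindrical transformation of Eq. 2.2. The following sketch is illustrative only: the function names, the sample point, and the central-difference step are assumptions. It builds $g_{ij}$ from finite-difference derivatives of (x, y, z) and reads off the scale factors as $h_i = \sqrt{g_{ii}}$; the off-diagonal elements should vanish, as Eq. 2.7 requires for an orthogonal system.

```python
import math

def cylindrical_to_cartesian(q):
    """Eq. 2.2 for circular cylindrical coordinates q = (rho, phi, z)."""
    rho, phi, z = q
    return (rho * math.cos(phi), rho * math.sin(phi), z)

def metric(transform, q, step=1e-6):
    """g_ij of Eq. 2.6, with the partial derivatives taken by central differences."""
    tangents = []
    for i in range(3):
        plus = list(q)
        minus = list(q)
        plus[i] += step
        minus[i] -= step
        r_plus = transform(plus)
        r_minus = transform(minus)
        tangents.append([(a - b) / (2.0 * step) for a, b in zip(r_plus, r_minus)])
    return [[sum(a * b for a, b in zip(tangents[i], tangents[j]))
             for j in range(3)] for i in range(3)]

q = (2.0, 0.7, -1.0)                       # sample point: rho = 2
g = metric(cylindrical_to_cartesian, q)
scale_factors = [math.sqrt(g[i][i]) for i in range(3)]   # h_i = sqrt(g_ii)
print(scale_factors)                       # expect roughly [1, rho, 1]
```

Any other transformation of the form of Eq. 2.2 can be passed to `metric` in the same way.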


1 The dq’s are arbitrary. For instance, setting $dq_2 = dq_3 = 0$ isolates $g_{11}$. It
might be noted that Eq. 2.6 can be derived from Eq. 2.4 more elegantly with
the matrix notation of Chapter 4. Further, the matrix notation leads directly
to the Jacobian determinant, Exercise 2.1.5.

2 The tensor nature of the set of $g_{ij}$’s follows from the quotient rule (Section
3.3). Then the tensor transformation law yields Eq. 2.6.

3 In relativistic cosmology the nondiagonal elements of the metric $g_{ij}$ are
usually set equal to zero as a consequence of the physical assumptions of no
rotation and no shear strains (see also Section 3.6).
Using this curvilinear component form, we find that a line integral becomes

$$\int \mathbf{V}\cdot d\mathbf{r} = \sum_i \int V_i h_i\, dq_i.$$

From Eq. 2.9 we may immediately develop the area and volume elements

$$d\sigma_{ij} = ds_i\, ds_j = h_i h_j\, dq_i\, dq_j \tag{2.10}$$

and

$$d\tau = ds_1\, ds_2\, ds_3 = h_1 h_2 h_3\, dq_1\, dq_2\, dq_3. \tag{2.11}$$

The expressions in Eqs. 2.10 and 2.11 agree, of course, with the results of using
the transformation equations, Eq. 2.2, and Jacobians.
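The agreement with the Jacobian can be spot-checked for circular cylindrical coordinates, where $h_1 h_2 h_3 = \rho$. This is a sketch with assumed names (`to_cartesian`, `jacobian_determinant`) and an arbitrary sample point; the partial derivatives are approximated by central differences and the determinant is evaluated as a scalar triple product.

```python
import math

def to_cartesian(rho, phi, z):
    """Eq. 2.2 for circular cylindrical coordinates."""
    return (rho * math.cos(phi), rho * math.sin(phi), z)

def jacobian_determinant(q, step=1e-6):
    """Determinant of d(x, y, z)/d(q1, q2, q3) from central differences."""
    columns = []
    for i in range(3):
        plus = list(q)
        minus = list(q)
        plus[i] += step
        minus[i] -= step
        r_plus = to_cartesian(*plus)
        r_minus = to_cartesian(*minus)
        columns.append([(a - b) / (2.0 * step) for a, b in zip(r_plus, r_minus)])
    a, b, c = columns
    # scalar triple product a . (b x c)
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))

rho, phi, z = 2.0, 0.7, -1.0
jacobian = jacobian_determinant((rho, phi, z))
product_of_scale_factors = 1.0 * rho * 1.0   # h_rho * h_phi * h_z
print(jacobian, product_of_scale_factors)
```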

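From Eq. 2.10 an area element may be expanded:

$$\begin{aligned}
d\boldsymbol{\sigma} &= ds_2\, ds_3\, \mathbf{e}_1 + ds_3\, ds_1\, \mathbf{e}_2 + ds_1\, ds_2\, \mathbf{e}_3 \\
&= h_2 h_3\, dq_2\, dq_3\, \mathbf{e}_1 + h_3 h_1\, dq_3\, dq_1\, \mathbf{e}_2 + h_1 h_2\, dq_1\, dq_2\, \mathbf{e}_3.
\end{aligned}$$

A surface integral becomes

$$\int \mathbf{V}\cdot d\boldsymbol{\sigma} = \int V_1 h_2 h_3\, dq_2\, dq_3 + \int V_2 h_3 h_1\, dq_3\, dq_1 + \int V_3 h_1 h_2\, dq_1\, dq_2.$$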


Examples of such line and surface integrals appear in Sections 2.4 and 2.5. 

In anticipation of the new forms of equations for vector calculus that appear 
in the next section, the student should clearly understand that vector algebra is 
the same in orthogonal curvilinear coordinates as in cartesian coordinates. 
Specifically, for the dot product 


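$$\mathbf{A}\cdot\mathbf{B} = A_1 B_1 + A_2 B_2 + A_3 B_3, \tag{2.11a}$$

where the subscripts indicate curvilinear components. For the cross product

$$\mathbf{A}\times\mathbf{B} = \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ A_1 & A_2 & A_3 \\ B_1 & B_2 & B_3 \end{vmatrix}, \tag{2.11b}$$

just like Eq. 1.35.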


EXERCISES 

2.1.1 Show that limiting our attention to orthogonal coordinate systems implies that
$g_{ij} = 0$ for $i \ne j$ (Eq. 2.7).

Hint. Construct a triangle with sides $ds_1$, $ds_2$, and $ds$. Equation 2.9 must hold
regardless of whether $g_{ij} = 0$. Then compare $ds^2$ from Eq. 2.5 with a calculation
using the law of cosines. Show that $\cos\theta_{12} = g_{12}/\sqrt{g_{11} g_{22}}$.
2.1.2 In the spherical polar coordinate system $q_1 = r$, $q_2 = \theta$, $q_3 = \varphi$. The transformation
equations corresponding to Eq. 2.2 are

$$x = r\sin\theta\cos\varphi$$

$$y = r\sin\theta\sin\varphi$$

$$z = r\cos\theta.$$

(a) Calculate the spherical polar coordinate scale factors: $h_r$, $h_\theta$, and $h_\varphi$.

(b) Check your calculated scale factors by the relation $ds_i = h_i\,dq_i$.

2.1.3 The u-, v-, z-coordinate system frequently used in electrostatics and in hydrodynamics
is defined by

$$xy = u,$$
$$x^2 - y^2 = v,$$
$$z = z.$$

This u-, v-, z-system is orthogonal.

(a) In words, describe briefly the nature of each of the three families of coordinate
surfaces.

(b) Sketch the system in the xy-plane showing the intersections of surfaces of
constant u and surfaces of constant v with the xy-plane.

(c) Indicate the directions of the unit vector $\mathbf{u}_0$ and $\mathbf{v}_0$ in all four quadrants.

(d) Finally, is this u-, v-, z-system right-handed ($\mathbf{u}_0\times\mathbf{v}_0 = +\mathbf{k}$) or left-handed
($\mathbf{u}_0\times\mathbf{v}_0 = -\mathbf{k}$)?


2.1.4 The elliptic cylindrical coordinate system consists of three families of surfaces:

1. $\dfrac{x^2}{a^2\cosh^2 u} + \dfrac{y^2}{a^2\sinh^2 u} = 1$,

2. $\dfrac{x^2}{a^2\cos^2 v} - \dfrac{y^2}{a^2\sin^2 v} = 1$,

3. $z = z$.

Sketch the coordinate surfaces u = constant and v = constant as they intersect
the first quadrant of the xy-plane. Show the unit vectors $\mathbf{u}_0$ and $\mathbf{v}_0$. The range of
u is $0 \le u < \infty$. The range of v is $0 \le v \le 2\pi$.


2.1.5 A two-dimensional orthogonal system is described by the coordinates $q_1$ and $q_2$.
Show that the Jacobian

$$J\left(\frac{x, y}{q_1, q_2}\right) = h_1 h_2$$

is in agreement with Eq. 2.10.

Hint. It’s easier to work with the square of each side of this equation.


2.1.6 In Minkowski space we define $x_1 = x$, $x_2 = y$, $x_3 = z$, and $x_4 = ict$. This is done
so that the space-time interval $ds^2 = dx^2 + dy^2 + dz^2 - c^2\,dt^2$ (c = velocity of
light) becomes $ds^2 = \sum_{i=1}^{4} dx_i^2$. Show that the metric in Minkowski space is $g_{ij} = \delta_{ij}$
or

$$(g_{ij}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$

This indicates the advantage of using Minkowski space in a special relativity
theory: It is a four-dimensional cartesian system. We use Minkowski space in
Sections 3.7 and 4.12 for describing Lorentz transformations.


2.2 DIFFERENTIAL VECTOR OPERATIONS 


Gradient 

The starting point for developing the gradient, divergence, and curl operators 
in curvilinear coordinates is our interpretation of the gradient as the vector 
having the magnitude and direction of the maximum space rate of change (compare
Section 1.6). From this interpretation the component of $\nabla\psi(q_1, q_2, q_3)$ in
the direction normal to the family of surfaces $q_1$ = constant is given by$^1$

$$\nabla\psi\big|_1 = \frac{\partial\psi}{\partial s_1} = \frac{1}{h_1}\frac{\partial\psi}{\partial q_1}, \tag{2.12}$$

since this is the rate of change of $\psi$ for varying $q_1$, holding $q_2$ and $q_3$ fixed. The
quantity $ds_1$ is a differential length in the direction of increasing $q_1$ (compare
Eq. 2.9). In Section 2.1 we introduced a unit vector $\mathbf{e}_1$ to indicate this direction.
By repeating Eq. 2.12 for $q_2$ and again for $q_3$ and adding vectorially, we see
that the gradient becomes

$$\begin{aligned}
\nabla\psi(q_1, q_2, q_3) &= \mathbf{e}_1\frac{\partial\psi}{\partial s_1} + \mathbf{e}_2\frac{\partial\psi}{\partial s_2} + \mathbf{e}_3\frac{\partial\psi}{\partial s_3} \\
&= \mathbf{e}_1\frac{1}{h_1}\frac{\partial\psi}{\partial q_1} + \mathbf{e}_2\frac{1}{h_2}\frac{\partial\psi}{\partial q_2} + \mathbf{e}_3\frac{1}{h_3}\frac{\partial\psi}{\partial q_3}.
\end{aligned} \tag{2.13}$$


Exercise 2.2.4 offers a mathematical alternative independent of this physical 
interpretation of the gradient. 
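As a concrete check of Eq. 2.13, one can compare its components in circular cylindrical coordinates ($h_\rho = 1$, $h_\varphi = \rho$, $h_z = 1$) with the cartesian gradient projected onto the unit vectors. The sketch below is an illustration under assumptions: the test function $\psi = x^2 y + z$, the sample point, and the helper names are choices made here, and the derivatives in Eq. 2.13 are taken by central differences.

```python
import math

def psi_cartesian(x, y, z):
    """Illustrative scalar field (an assumption, not from the text)."""
    return x * x * y + z

def psi_cylindrical(rho, phi, z):
    return psi_cartesian(rho * math.cos(phi), rho * math.sin(phi), z)

def gradient_cylindrical(q, step=1e-6):
    """Components (1/h_i) d(psi)/dq_i of Eq. 2.13 with h = (1, rho, 1)."""
    scale = (1.0, q[0], 1.0)
    components = []
    for i in range(3):
        plus = list(q)
        minus = list(q)
        plus[i] += step
        minus[i] -= step
        derivative = (psi_cylindrical(*plus) - psi_cylindrical(*minus)) / (2.0 * step)
        components.append(derivative / scale[i])
    return components

rho, phi, z = 2.0, 0.7, -1.0
x, y = rho * math.cos(phi), rho * math.sin(phi)
grad_cartesian = (2.0 * x * y, x * x, 1.0)            # analytic cartesian gradient
unit_vectors = ((math.cos(phi), math.sin(phi), 0.0),   # e_rho
                (-math.sin(phi), math.cos(phi), 0.0),  # e_phi
                (0.0, 0.0, 1.0))                       # e_z
expected = [sum(g * e for g, e in zip(grad_cartesian, unit)) for unit in unit_vectors]
computed = gradient_cylindrical((rho, phi, z))
print(computed, expected)
```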


Divergence 

The divergence operator may be obtained from the second definition (Eq.
1.91) of Chapter 1 or equivalently from Gauss’s theorem, Section 1.11. Let
us use Eq. 1.91:

$$\nabla\cdot\mathbf{V}(q_1, q_2, q_3) = \lim_{\int d\tau \to 0} \frac{\int \mathbf{V}\cdot d\boldsymbol{\sigma}}{\int d\tau} \tag{2.14}$$

with a differential volume $h_1 h_2 h_3\, dq_1\, dq_2\, dq_3$ (Fig. 2.1). Note that the positive
directions have been chosen so that $(q_1, q_2, q_3)$ or $(\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3)$ form a right-handed
set, $\mathbf{e}_1\times\mathbf{e}_2 = \mathbf{e}_3$.

1 Here the use of $\varphi$ to label a function is avoided because it is conventional to
use this symbol to denote an azimuthal coordinate.

The area integral for the two faces $q_1$ = constant is given by



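$$\left[V_1 h_2 h_3 + \frac{\partial}{\partial q_1}(V_1 h_2 h_3)\, dq_1\right] dq_2\, dq_3 - V_1 h_2 h_3\, dq_2\, dq_3 = \frac{\partial}{\partial q_1}(V_1 h_2 h_3)\, dq_1\, dq_2\, dq_3, \tag{2.15}$$

exactly as in Sections 1.7 and 1.10.$^2$ Adding in the similar results for the other
two pairs of surfaces, we obtain

$$\int \mathbf{V}(q_1, q_2, q_3)\cdot d\boldsymbol{\sigma} = \left[\frac{\partial}{\partial q_1}(V_1 h_2 h_3) + \frac{\partial}{\partial q_2}(V_2 h_3 h_1) + \frac{\partial}{\partial q_3}(V_3 h_1 h_2)\right] dq_1\, dq_2\, dq_3. \tag{2.16}$$

Division by our differential volume (Eq. 2.14) yields

$$\nabla\cdot\mathbf{V}(q_1, q_2, q_3) = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial q_1}(V_1 h_2 h_3) + \frac{\partial}{\partial q_2}(V_2 h_3 h_1) + \frac{\partial}{\partial q_3}(V_3 h_1 h_2)\right]. \tag{2.17}$$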

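In Eq. 2.17 $V_i$ is the component of V in the $\mathbf{e}_i$ direction, increasing $q_i$; that is,
$V_i = \mathbf{e}_i\cdot\mathbf{V}$.

We may obtain the Laplacian by combining Eqs. 2.13 and 2.17, using
$\mathbf{V} = \nabla\psi(q_1, q_2, q_3)$. This leads to

$$\nabla\cdot\nabla\psi(q_1, q_2, q_3) = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial q_1}\left(\frac{h_2 h_3}{h_1}\frac{\partial\psi}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\left(\frac{h_3 h_1}{h_2}\frac{\partial\psi}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\left(\frac{h_1 h_2}{h_3}\frac{\partial\psi}{\partial q_3}\right)\right]. \tag{2.18a}$$

2 Since we take the limit $dq_1, dq_2, dq_3 \to 0$, the second- and higher-order
derivatives will drop out.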

Curl 

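Finally, to develop $\nabla\times\mathbf{V}$, let us apply Stokes’s theorem (Section 1.12) and,
as with the divergence, take the limit as the surface area becomes vanishingly
small. Working on one component at a time, we consider a differential surface
element in the curvilinear surface $q_1$ = constant. From

$$\int_s \nabla\times\mathbf{V}\cdot d\boldsymbol{\sigma} = \nabla\times\mathbf{V}\big|_1\, h_2 h_3\, dq_2\, dq_3 \tag{2.18b}$$

(mean value theorem of integral calculus) Stokes’s theorem yields

$$\nabla\times\mathbf{V}\big|_1\, h_2 h_3\, dq_2\, dq_3 = \oint \mathbf{V}\cdot d\mathbf{r}, \tag{2.19}$$

with the line integral lying in the surface $q_1$ = constant. Following the loop
(1, 2, 3, 4) of Fig. 2.2,

$$\begin{aligned}
\oint \mathbf{V}(q_1, q_2, q_3)\cdot d\mathbf{r} &= V_2 h_2\, dq_2 + \left[V_3 h_3 + \frac{\partial}{\partial q_2}(V_3 h_3)\, dq_2\right] dq_3 \\
&\quad - \left[V_2 h_2 + \frac{\partial}{\partial q_3}(V_2 h_2)\, dq_3\right] dq_2 - V_3 h_3\, dq_3 \\
&= \left[\frac{\partial}{\partial q_2}(h_3 V_3) - \frac{\partial}{\partial q_3}(h_2 V_2)\right] dq_2\, dq_3.
\end{aligned} \tag{2.20}$$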


FIG. 2.2 Curvilinear surface element

We pick up a positive sign when going in the positive direction on parts 1 and 2
and a negative sign on parts 3 and 4 because here we are going in the negative
direction. Higher-order terms in Maclaurin or Taylor expansion have been
omitted. They will vanish in the limit as the surface becomes vanishingly small
($dq_2 \to 0$, $dq_3 \to 0$).

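From Eq. 2.19

$$\nabla\times\mathbf{V}\big|_1 = \frac{1}{h_2 h_3}\left[\frac{\partial}{\partial q_2}(h_3 V_3) - \frac{\partial}{\partial q_3}(h_2 V_2)\right]. \tag{2.21}$$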


The remaining two components of V x V may be picked up by cyclic permuta- 
tion of the indices. As in Chapter 1, it is often convenient to write the curl in 
determinant form : 



e 1 /i 1 


e 3^3 

1 


_5_ 


h t h 2 h 3 

dq i 

dq 2 

dq 3 


hyVy 

h 2 V 2 

h 3 V 3 


( 2 . 22 ) 


Remember that because of the presence of the differential operators, this 
determinant must be expanded from the top down. Note that this equation is 
not identical with the form for the cross product of two vectors, Eq. 2.1 16. 
$\nabla$ is not an ordinary vector; it is a vector operator.

Our geometric interpretation of the gradient and the use of Gauss’s and
Stokes’s theorems (or integral definitions of divergence and curl) have enabled
us to obtain these quantities without having to differentiate the unit vectors $\mathbf{e}_i$.
There exist alternate ways to determine grad, div, and curl based on direct
differentiation of the $\mathbf{e}_i$. One approach resolves the $\mathbf{e}_i$ of a specific coordinate
system into its cartesian components (Exercises 2.4.1 and 2.5.1) and differentiates
this cartesian form (Exercises 2.4.3 and 2.5.2). The point here is that the
derivatives of the cartesian i, j, and k vanish since i, j, and k are constant in
direction as well as in magnitude. A second approach [L. J. Kijewski, Am. J.
Phys. 33, 816 (1965)] assumes the equality of $\partial^2\mathbf{r}/\partial q_i\,\partial q_j$ and $\partial^2\mathbf{r}/\partial q_j\,\partial q_i$ and
develops the derivatives of $\mathbf{e}_i$ in a general curvilinear form. Exercises 2.2.3 and
2.2.4 are based on this method.


EXERCISES 


2.2.1 Develop arguments to show that ordinary dot and cross products (not involving
$\nabla$) in orthogonal curvilinear coordinates proceed as in cartesian coordinates with
no involvement of scale factors.


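2.2.2 With $\mathbf{e}_1$ a unit vector in the direction of increasing $q_1$, show that

(a) $\nabla\cdot\mathbf{e}_1 = \dfrac{1}{h_1h_2h_3}\dfrac{\partial(h_2h_3)}{\partial q_1}$,

(b) $\nabla\times\mathbf{e}_1 = \dfrac{1}{h_1}\left[\mathbf{e}_2\dfrac{1}{h_3}\dfrac{\partial h_1}{\partial q_3} - \mathbf{e}_3\dfrac{1}{h_2}\dfrac{\partial h_1}{\partial q_2}\right]$.

Note that even though $\mathbf{e}_1$ is a unit vector, its divergence and curl do not necessarily
vanish.

2.2.3 Show that the orthogonal unit vectors $\mathbf{e}_i$ may be defined by

$$\mathbf{e}_i = \frac{1}{h_i}\frac{\partial\mathbf{r}}{\partial q_i}. \tag{a}$$

In particular, show that $\mathbf{e}_i\cdot\mathbf{e}_i = 1$ leads to an expression for $h_i$ in agreement with
Eq. 2.6.

Eq. (a) may be taken as a starting point for deriving

$$\frac{\partial\mathbf{e}_i}{\partial q_j} = \mathbf{e}_j\frac{1}{h_i}\frac{\partial h_j}{\partial q_i},\qquad i \neq j,$$

and

$$\frac{\partial\mathbf{e}_i}{\partial q_i} = -\sum_{j\neq i}\mathbf{e}_j\frac{1}{h_j}\frac{\partial h_i}{\partial q_j}.$$

2.2.4 Derive

$$\nabla\psi = \mathbf{e}_1\frac{1}{h_1}\frac{\partial\psi}{\partial q_1} + \mathbf{e}_2\frac{1}{h_2}\frac{\partial\psi}{\partial q_2} + \mathbf{e}_3\frac{1}{h_3}\frac{\partial\psi}{\partial q_3}$$

by direct application of Eq. 1.90,

$$\nabla\psi = \lim_{\int d\tau\to 0}\frac{\int\psi\,d\boldsymbol{\sigma}}{\int d\tau}.$$

Hint. Evaluation of the surface integral will lead to terms like $(h_1h_2h_3)^{-1}(\partial/\partial q_1)(\mathbf{e}_1h_2h_3)$. The results listed in Ex. 2.2.3 will be helpful. Cancellation of unwanted
terms occurs when the contributions of all three pairs of surfaces are added together.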


2.3 SPECIAL COORDINATE SYSTEMS— 

RECTANGULAR CARTESIAN COORDINATES 

As mentioned in Section 2.1, there are 11 coordinate systems in which the 
three-dimensional Helmholtz equation can be separated into three ordinary 
differential equations. Some of these coordinate systems have achieved prominence
in the historical development of quantum mechanics. Other systems, such
as bipolar coordinates, satisfy special needs. Partly because the needs are rather
infrequent, but mostly because the development of high-speed computing 
machines and efficient programming techniques reduces the need for these 
coordinate systems, the discussion in this chapter is limited to (1) cartesian 
coordinates, (2) spherical polar coordinates, and (3) circular cylindrical co- 
ordinates. Specifications and details of the other coordinate systems will be 
found in the first two editions of this work and in the references (Morse and 
Feshbach, Margenau and Murphy). 

Rectangular Cartesian Coordinates 

These are the cartesian coordinates on which Chapter 1 is based. In this
simplest of all systems

$$h_1 = h_x = 1,\qquad h_2 = h_y = 1,\qquad h_3 = h_z = 1. \tag{2.23}$$

The families of coordinate surfaces are three sets of parallel planes: x = constant,
y = constant, and z = constant. The cartesian coordinate system is unique
in that all its $h_i$’s are constant. This will be a significant advantage in treating
tensors in Chapter 3. Note also that the unit vectors, $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ or i, j, k, have
fixed directions.

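From Eqs. 2.13, 2.17, 2.18, and 2.22 we reproduce the results of Chapter 1,

$$\nabla\psi = \mathbf{i}\frac{\partial\psi}{\partial x} + \mathbf{j}\frac{\partial\psi}{\partial y} + \mathbf{k}\frac{\partial\psi}{\partial z}, \tag{2.24}$$

$$\nabla\cdot\mathbf{V} = \frac{\partial V_x}{\partial x} + \frac{\partial V_y}{\partial y} + \frac{\partial V_z}{\partial z}, \tag{2.25}$$

$$\nabla\cdot\nabla\psi = \frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}, \tag{2.26}$$

$$\nabla\times\mathbf{V} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\[4pt] \dfrac{\partial}{\partial x} & \dfrac{\partial}{\partial y} & \dfrac{\partial}{\partial z} \\[4pt] V_x & V_y & V_z \end{vmatrix}. \tag{2.27}$$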


2.4 CIRCULAR CYLINDRICAL COORDINATES $(\rho, \varphi, z)$


In the circular cylindrical coordinate system the three curvilinear coordinates
$(q_1, q_2, q_3)$ are relabeled $(\rho, \varphi, z)$. The coordinate surfaces, shown in Fig. 2.3,
are

1. Right circular cylinders having the z-axis as a common axis,

$$\rho = (x^2 + y^2)^{1/2} = \text{constant}.$$

2. Half planes through the z-axis,

$$\varphi = \tan^{-1}\left(\frac{y}{x}\right) = \text{constant}.$$

3. Planes parallel to the xy-plane, as in the cartesian system,

$$z = \text{constant}.$$

The limits on $\rho$, $\varphi$, and $z$ are

$$0 \le \rho < \infty,\qquad 0 \le \varphi \le 2\pi,\qquad\text{and}\qquad -\infty < z < \infty.$$

Note that we are using $\rho$ for the perpendicular distance from the z-axis and
saving $r$ for the distance from the origin.

Inverting the preceding equations for $\rho$ and $\varphi$ (or going directly to Fig. 2.3),
we obtain the transformation relations

FIG. 2.3 Circular cylinder coordinates

$$x = \rho\cos\varphi,\qquad y = \rho\sin\varphi,\qquad z = z. \tag{2.28}$$

The z-axis remains unchanged. This is essentially a two-dimensional curvilinear
system with a cartesian z-axis added on to form a three-dimensional system.
According to Eq. 2.28 or from the length elements $ds_i$, the scale factors are

$$h_1 = h_\rho = 1,\qquad h_2 = h_\varphi = \rho,\qquad h_3 = h_z = 1. \tag{2.29}$$
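The scale factors can also be recovered numerically from the transformation relations alone. The sketch below assumes the usual definition $h_i^2 = \sum_j (\partial x_j/\partial q_i)^2$ (taken here to be the content of Eq. 2.6, which is not reproduced in this section) and approximates the partial derivatives by central differences. The helper names and the step size are illustrative choices.

```python
import math

def cylindrical_to_cartesian(rho, phi, z):
    # Transformation relations of Eq. 2.28
    return (rho * math.cos(phi), rho * math.sin(phi), z)

def scale_factors(transform, q, d=1e-6):
    """Return [h1, h2, h3] at q, using h_i^2 = sum_j (dx_j/dq_i)^2 (assumed definition)."""
    factors = []
    for i in range(3):
        plus, minus = list(q), list(q)
        plus[i] += d
        minus[i] -= d
        r_plus, r_minus = transform(*plus), transform(*minus)
        # Length of the central-difference approximation to dr/dq_i
        factors.append(math.sqrt(sum(((a - b) / (2 * d)) ** 2
                                     for a, b in zip(r_plus, r_minus))))
    return factors

h_rho, h_phi, h_z = scale_factors(cylindrical_to_cartesian, [2.0, 0.9, -1.5])
print(h_rho, h_phi, h_z)
```

At $\rho = 2$ the three values should come out close to 1, 2, and 1, in agreement with Eq. 2.29. The same routine can be pointed at any other transformation by swapping the `transform` argument.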

The unit vectors $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ are relabeled $(\boldsymbol{\rho}_0, \boldsymbol{\varphi}_0, \mathbf{k})$, Fig. 2.4. The unit vector
$\boldsymbol{\rho}_0$ is normal to the cylindrical surface, pointing in the direction of increasing
radius $\rho$. The unit vector $\boldsymbol{\varphi}_0$ is tangential to the cylindrical surface, perpendicular
to the half plane $\varphi = \text{constant}$, and pointing in the direction of increasing
azimuth angle $\varphi$. The third unit vector, $\mathbf{k}$, is the usual cartesian unit vector.

FIG. 2.4 Circular cylindrical coordinate unit vectors

A differential displacement $d\mathbf{r}$ may be written

$$d\mathbf{r} = \boldsymbol{\rho}_0\,ds_\rho + \boldsymbol{\varphi}_0\,ds_\varphi + \mathbf{k}\,dz = \boldsymbol{\rho}_0\,d\rho + \boldsymbol{\varphi}_0\,\rho\,d\varphi + \mathbf{k}\,dz. \tag{2.30}$$


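The differential operations involving $\nabla$ follow from Eqs. 2.13, 2.17, 2.18,
and 2.22,

$$\nabla\psi(\rho,\varphi,z) = \boldsymbol{\rho}_0\frac{\partial\psi}{\partial\rho} + \boldsymbol{\varphi}_0\frac{1}{\rho}\frac{\partial\psi}{\partial\varphi} + \mathbf{k}\frac{\partial\psi}{\partial z}, \tag{2.31}$$

$$\nabla\cdot\mathbf{V} = \frac{1}{\rho}\frac{\partial}{\partial\rho}(\rho V_\rho) + \frac{1}{\rho}\frac{\partial V_\varphi}{\partial\varphi} + \frac{\partial V_z}{\partial z}, \tag{2.32}$$

$$\nabla^2\psi = \frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial\psi}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2\psi}{\partial\varphi^2} + \frac{\partial^2\psi}{\partial z^2}, \tag{2.33}$$

$$\nabla\times\mathbf{V} = \frac{1}{\rho}\begin{vmatrix} \boldsymbol{\rho}_0 & \rho\boldsymbol{\varphi}_0 & \mathbf{k} \\[4pt] \dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\varphi} & \dfrac{\partial}{\partial z} \\[4pt] V_\rho & \rho V_\varphi & V_z \end{vmatrix}. \tag{2.34}$$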


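Finally, for problems such as circular wave guides or cylindrical cavity resonators
the vector Laplacian $\nabla^2\mathbf{V}$ resolved in circular cylindrical coordinates is

$$\begin{aligned}
\nabla^2\mathbf{V}|_\rho &= \nabla^2V_\rho - \frac{1}{\rho^2}V_\rho - \frac{2}{\rho^2}\frac{\partial V_\varphi}{\partial\varphi}, \\
\nabla^2\mathbf{V}|_\varphi &= \nabla^2V_\varphi - \frac{1}{\rho^2}V_\varphi + \frac{2}{\rho^2}\frac{\partial V_\rho}{\partial\varphi}, \\
\nabla^2\mathbf{V}|_z &= \nabla^2V_z.
\end{aligned} \tag{2.35}$$

The basic reason for the form of the z-component is that the z-axis is a cartesian
axis; that is,

$$\begin{aligned}
\nabla^2(\boldsymbol{\rho}_0V_\rho + \boldsymbol{\varphi}_0V_\varphi + \mathbf{k}V_z) &= \nabla^2(\boldsymbol{\rho}_0V_\rho + \boldsymbol{\varphi}_0V_\varphi) + \mathbf{k}\nabla^2V_z \\
&= \boldsymbol{\rho}_0 f(V_\rho, V_\varphi) + \boldsymbol{\varphi}_0\, g(V_\rho, V_\varphi) + \mathbf{k}\nabla^2V_z.
\end{aligned}$$

The operator $\nabla^2$ operating on the $\boldsymbol{\rho}_0$, $\boldsymbol{\varphi}_0$ unit vectors stays in the $\boldsymbol{\rho}_0\boldsymbol{\varphi}_0$-plane.
This behavior holds in all such cylindrical systems.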


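EXAMPLE 2.4.1 A Navier-Stokes Term

The Navier-Stokes equations of hydrodynamics contain a nonlinear term

$$\nabla\times[\mathbf{v}\times(\nabla\times\mathbf{v})],$$

where $\mathbf{v}$ is the fluid velocity. For fluid flowing through a cylindrical pipe in
the z-direction

$$\mathbf{v} = \mathbf{k}\,v(\rho).$$

From Eq. 2.34

$$\nabla\times\mathbf{v} = \frac{1}{\rho}\begin{vmatrix} \boldsymbol{\rho}_0 & \rho\boldsymbol{\varphi}_0 & \mathbf{k} \\[4pt] \dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\varphi} & \dfrac{\partial}{\partial z} \\[4pt] 0 & 0 & v(\rho) \end{vmatrix} = -\boldsymbol{\varphi}_0\frac{\partial v}{\partial\rho},$$

$$\mathbf{v}\times(\nabla\times\mathbf{v}) = \begin{vmatrix} \boldsymbol{\rho}_0 & \boldsymbol{\varphi}_0 & \mathbf{k} \\[4pt] 0 & 0 & v \\[4pt] 0 & -\dfrac{\partial v}{\partial\rho} & 0 \end{vmatrix} = \boldsymbol{\rho}_0\, v(\rho)\frac{\partial v}{\partial\rho}.$$

Finally,

$$\nabla\times(\mathbf{v}\times(\nabla\times\mathbf{v})) = \frac{1}{\rho}\begin{vmatrix} \boldsymbol{\rho}_0 & \rho\boldsymbol{\varphi}_0 & \mathbf{k} \\[4pt] \dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\varphi} & \dfrac{\partial}{\partial z} \\[4pt] v\dfrac{\partial v}{\partial\rho} & 0 & 0 \end{vmatrix} = 0.$$

For this particular case the nonlinear term vanishes.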
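The result of Example 2.4.1 can be cross-checked in cartesian components with a small finite-difference curl. In the sketch below the parabolic profile $v(\rho) = 1 - \rho^2$, the sample point, and the helper names are hypothetical choices made only for illustration; any smooth $v(\rho)$ should behave the same way.

```python
def cross(a, b):
    # Ordinary cartesian cross product
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def curl(field, p, d=1e-4):
    """Central-difference curl of field(x, y, z) -> (Fx, Fy, Fz) at the point p."""
    def partial(component, axis):
        plus, minus = list(p), list(p)
        plus[axis] += d
        minus[axis] -= d
        return (field(*plus)[component] - field(*minus)[component]) / (2 * d)
    return (partial(2, 1) - partial(1, 2),
            partial(0, 2) - partial(2, 0),
            partial(1, 0) - partial(0, 1))

def velocity(x, y, z):
    # Hypothetical axial flow v = k v(rho) with v(rho) = 1 - rho^2
    return (0.0, 0.0, 1.0 - (x * x + y * y))

def v_cross_curl_v(x, y, z):
    return cross(velocity(x, y, z), curl(velocity, (x, y, z)))

p = (0.3, 0.4, 0.1)
curl_v = curl(velocity, p)                # expected -phi_0 dv/drho = (-2y, 2x, 0)
nonlinear_term = curl(v_cross_curl_v, p)  # expected to vanish
print(curl_v, nonlinear_term)
```

For this profile $-\boldsymbol{\varphi}_0\,\partial v/\partial\rho$ has cartesian components $(-2y, 2x, 0)$, so `curl_v` should be close to $(-0.8, 0.6, 0)$ and all three components of `nonlinear_term` should be close to zero.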
EXERCISES

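2.4.1 Resolve the circular cylindrical unit vectors into their cartesian components
(Fig. 2.5).

ANS. $\boldsymbol{\rho}_0 = \mathbf{i}\cos\varphi + \mathbf{j}\sin\varphi$,
$\boldsymbol{\varphi}_0 = -\mathbf{i}\sin\varphi + \mathbf{j}\cos\varphi$,
$\mathbf{k}_0 = \mathbf{k}$.

2.4.2 Resolve the cartesian unit vectors into their circular cylindrical components.

ANS. $\mathbf{i} = \boldsymbol{\rho}_0\cos\varphi - \boldsymbol{\varphi}_0\sin\varphi$,
$\mathbf{j} = \boldsymbol{\rho}_0\sin\varphi + \boldsymbol{\varphi}_0\cos\varphi$,
$\mathbf{k} = \mathbf{k}_0$.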


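2.4.3 From the results of Ex. 2.4.1 show that

$$\frac{\partial\boldsymbol{\rho}_0}{\partial\varphi} = \boldsymbol{\varphi}_0,\qquad \frac{\partial\boldsymbol{\varphi}_0}{\partial\varphi} = -\boldsymbol{\rho}_0$$

and that all other first derivatives of the circular cylindrical unit vectors with
respect to the circular cylindrical coordinates vanish.

2.4.4 Compare $\nabla\cdot\mathbf{V}$ (Eq. 2.32) with the gradient operator

$$\nabla = \boldsymbol{\rho}_0\frac{\partial}{\partial\rho} + \boldsymbol{\varphi}_0\frac{1}{\rho}\frac{\partial}{\partial\varphi} + \mathbf{k}\frac{\partial}{\partial z}$$

(Eq. 2.31) dotted into $\mathbf{V}$. Note that the differential operators of $\nabla$ differentiate
both the unit vectors and the components of $\mathbf{V}$.

Hint. $\boldsymbol{\varphi}_0(1/\rho)(\partial/\partial\varphi)\cdot\boldsymbol{\rho}_0V_\rho$ becomes $\boldsymbol{\varphi}_0\cdot\dfrac{1}{\rho}\dfrac{\partial}{\partial\varphi}(\boldsymbol{\rho}_0V_\rho)$ and does not vanish.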

2.4.5 (a) Show that $\mathbf{r} = \boldsymbol{\rho}_0\rho + \mathbf{k}_0z$.

(b) Working entirely in circular cylindrical coordinates, show that

$$\nabla\cdot\mathbf{r} = 3\qquad\text{and}\qquad\nabla\times\mathbf{r} = 0.$$

2.4.6 (a) Show that the parity operation (reflection through the origin) on a point
$(\rho, \varphi, z)$ relative to fixed x-, y-, z-axes consists of the transformation

$$\rho\to\rho,\qquad \varphi\to\varphi\pm\pi,\qquad z\to -z.$$

(b) Show that $\boldsymbol{\rho}_0$ and $\boldsymbol{\varphi}_0$ have odd parity (reversal of direction) and that $\mathbf{k}$
has even parity.

Note. The cartesian unit vectors i, j, and k remain constant.

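2.4.7 A rigid body is rotating about a fixed axis with a constant angular velocity $\boldsymbol{\omega}$.
Take $\boldsymbol{\omega}$ to lie along the z-axis. Express $\mathbf{r}$ in circular cylindrical coordinates and
using circular cylindrical coordinates,

(a) calculate $\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$.

(b) calculate $\nabla\times\mathbf{v}$.

ANS. (a) $\mathbf{v} = \boldsymbol{\varphi}_0\,\omega\rho$
(b) $\nabla\times\mathbf{v} = 2\boldsymbol{\omega}$

2.4.8 A particle is moving through space. Find the circular cylindrical components
of its velocity and acceleration.

$$\begin{aligned}
v_\rho &= \dot\rho, & a_\rho &= \ddot\rho - \rho\dot\varphi^2, \\
v_\varphi &= \rho\dot\varphi, & a_\varphi &= \rho\ddot\varphi + 2\dot\rho\dot\varphi, \\
v_z &= \dot z, & a_z &= \ddot z.
\end{aligned}$$

Hint.

$$\mathbf{r}(t) = \boldsymbol{\rho}_0(t)\rho(t) + \mathbf{k}z(t) = [\mathbf{i}\cos\varphi(t) + \mathbf{j}\sin\varphi(t)]\rho(t) + \mathbf{k}z(t).$$

Note. $\dot\rho = d\rho/dt$, $\ddot\rho = d^2\rho/dt^2$, and so on.

2.4.9 Solve Laplace’s equation $\nabla^2\psi = 0$ in cylindrical coordinates for $\psi = \psi(\rho)$.

ANS. $\psi = k\ln\dfrac{\rho}{\rho_0}$

2.4.10 In right circular cylindrical coordinates a particular vector function is given by

$$\mathbf{V}(\rho,\varphi) = \boldsymbol{\rho}_0V_\rho(\rho,\varphi) + \boldsymbol{\varphi}_0V_\varphi(\rho,\varphi).$$

Show that $\nabla\times\mathbf{V}$ has only a z-component. Note that this result will hold for
any vector confined to a surface $q_3 = \text{constant}$ as long as the products $h_1V_1$ and
$h_2V_2$ are each independent of $q_3$.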

2.4.11 For the flow of an incompressible viscous fluid the Navier-Stokes equations
lead to

$$-\nabla\times(\mathbf{v}\times(\nabla\times\mathbf{v})) = \frac{\eta}{\rho_0}\nabla^2(\nabla\times\mathbf{v}).$$

Here $\eta$ is the viscosity and $\rho_0$ the density of the fluid. For axial flow in a cylindrical
pipe we take the velocity $\mathbf{v}$ to be

$$\mathbf{v} = \mathbf{k}\,v(\rho).$$

From Example 2.4.1

$$\nabla\times(\mathbf{v}\times(\nabla\times\mathbf{v})) = 0$$

for this choice of $\mathbf{v}$.

Show that

$$\nabla^2(\nabla\times\mathbf{v}) = 0$$

leads to the differential equation

$$\frac{1}{\rho}\frac{d}{d\rho}\left(\rho\frac{d^2v}{d\rho^2}\right) - \frac{1}{\rho^2}\frac{dv}{d\rho} = 0$$

and that this is satisfied by

$$v = v_0 + a_2\rho^2.$$


2.4.12 A conducting wire along the z-axis carries a current $I$. The resulting magnetic
vector potential is given by

$$\mathbf{A} = \mathbf{k}\frac{\mu_0I}{2\pi}\ln\left(\frac{1}{\rho}\right).$$

Show that the magnetic induction $\mathbf{B}$ is given by

$$\mathbf{B} = \boldsymbol{\varphi}_0\frac{\mu_0I}{2\pi\rho}.$$

2.4.13 A force is described by

$$\mathbf{F} = -\mathbf{i}\frac{y}{x^2+y^2} + \mathbf{j}\frac{x}{x^2+y^2}.$$

(a) Express $\mathbf{F}$ in circular cylindrical coordinates.

Operating entirely in circular cylindrical coordinates for (b) and (c),

(b) calculate the curl of $\mathbf{F}$ and

(c) calculate the work done by $\mathbf{F}$ in encircling the unit circle once counterclockwise.

(d) How do you reconcile the results of (b) and (c)?


2.4.14 A transverse electromagnetic wave (TEM) in a coaxial wave guide has an electric
field $\mathbf{E} = \mathbf{E}(\rho,\varphi)e^{i(kz-\omega t)}$ and a magnetic induction field of $\mathbf{B} = \mathbf{B}(\rho,\varphi)e^{i(kz-\omega t)}$.
Since the wave is transverse neither $\mathbf{E}$ nor $\mathbf{B}$ has a z component. The two fields
satisfy the vector Laplacian equation

$$\nabla^2\mathbf{E}(\rho,\varphi) = 0,$$

$$\nabla^2\mathbf{B}(\rho,\varphi) = 0.$$

(a) Show that $\mathbf{E} = \boldsymbol{\rho}_0E_0(a/\rho)e^{i(kz-\omega t)}$ and $\mathbf{B} = \boldsymbol{\varphi}_0B_0(a/\rho)e^{i(kz-\omega t)}$ are solutions.
Here $a$ is the radius of the inner conductor and $E_0$ and $B_0$ are amplitudes.

(b) Assuming a vacuum inside the wave guide, verify that Maxwell’s equations
are satisfied with

$$B_0/E_0 = k/\omega = \mu_0\varepsilon_0(\omega/k) = 1/c.$$

2.4.15 A calculation of the magnetohydrodynamic pinch effect involves the evaluation of
$(\mathbf{B}\cdot\nabla)\mathbf{B}$. If the magnetic induction $\mathbf{B}$ is taken to be $\mathbf{B} = \boldsymbol{\varphi}_0B_\varphi(\rho)$, show that

$$(\mathbf{B}\cdot\nabla)\mathbf{B} = -\boldsymbol{\rho}_0B_\varphi^2/\rho.$$

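2.4.16 The linear velocity of particles in a rigid body rotating with angular velocity
$\boldsymbol{\omega}$ is given by

$$\mathbf{v} = \boldsymbol{\varphi}_0\,\rho\omega.$$

Integrate $\oint\mathbf{v}\cdot d\boldsymbol{\lambda}$ around a circle in the xy-plane and verify that

$$\frac{\oint\mathbf{v}\cdot d\boldsymbol{\lambda}}{\text{area}} = \nabla\times\mathbf{v}|_z.$$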


2.5 SPHERICAL POLAR COORDINATES $(r, \theta, \varphi)$

Relabeling $(q_1, q_2, q_3)$ as $(r, \theta, \varphi)$, we see that the spherical polar coordinate
system consists of the following:

1. Concentric spheres centered at the origin,

$$r = (x^2 + y^2 + z^2)^{1/2} = \text{constant}.$$

2. Right circular cones centered on the z-(polar) axis,
vertices at the origin,

$$\theta = \arccos\frac{z}{(x^2 + y^2 + z^2)^{1/2}} = \text{constant}.$$

3. Half planes through the z-(polar) axis,

$$\varphi = \arctan\frac{y}{x} = \text{constant}.$$

By our arbitrary choice of definitions of $\theta$, the polar angle, and $\varphi$, the azimuth
angle, the z-axis is singled out for special treatment. The transformation
equations corresponding to Eq. 2.2 are

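$$x = r\sin\theta\cos\varphi,\qquad y = r\sin\theta\sin\varphi,\qquad z = r\cos\theta, \tag{2.36}$$

measuring $\theta$ from the positive z-axis and $\varphi$ in the xy-plane from the positive
x-axis. The ranges of values are $0 \le r < \infty$, $0 \le \theta \le \pi$, and $0 \le \varphi \le 2\pi$. From
Eq. 2.6

$$h_1 = h_r = 1,\qquad h_2 = h_\theta = r,\qquad h_3 = h_\varphi = r\sin\theta. \tag{2.37}$$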

This gives a line element

$$d\mathbf{r} = \mathbf{r}_0\,dr + \boldsymbol{\theta}_0\,r\,d\theta + \boldsymbol{\varphi}_0\,r\sin\theta\,d\varphi.$$

In this spherical coordinate system the area element (for $r = \text{constant}$) is

$$dA = d\sigma_{\theta\varphi} = r^2\sin\theta\,d\theta\,d\varphi, \tag{2.38}$$

the dark, shaded area in Fig. 2.6. Integrating over the azimuth $\varphi$, we find that
the area element becomes a ring of width $d\theta$,

$$dA = 2\pi r^2\sin\theta\,d\theta. \tag{2.39}$$

This form will appear repeatedly in problems in spherical polar coordinates
with azimuthal symmetry, such as the scattering of an unpolarized beam of
nuclear particles. By definition of solid radians or steradians, an element of
solid angle $d\Omega$ is given by

$$d\Omega = \frac{dA}{r^2} = \sin\theta\,d\theta\,d\varphi. \tag{2.40}$$

Integrating over the entire spherical surface, we obtain

$$\int d\Omega = 4\pi.$$
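The value $4\pi$ is easy to confirm by summing the ring elements $2\pi\sin\theta\,d\theta$ numerically. The following sketch uses a simple midpoint rule; the function name `solid_angle` and the number of steps are arbitrary illustrative choices.

```python
import math

def solid_angle(theta_max, n_steps=2000):
    """Midpoint-rule integral of sin(theta) dtheta dphi for 0 <= theta <= theta_max."""
    d_theta = theta_max / n_steps
    ring_sum = sum(math.sin((i + 0.5) * d_theta) for i in range(n_steps)) * d_theta
    return 2.0 * math.pi * ring_sum  # the phi integration contributes 2*pi

full_sphere = solid_angle(math.pi)
hemisphere = solid_angle(math.pi / 2)
print(full_sphere, 4 * math.pi)
print(hemisphere, 2 * math.pi)
```

The same function gives the solid angle of a polar cap of half-angle `theta_max`, which in closed form is $2\pi(1 - \cos\theta_{\max})$.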


From Eq. 2.11 the volume element is

$$d\tau = r^2\,dr\,\sin\theta\,d\theta\,d\varphi = r^2\,dr\,d\Omega. \tag{2.41}$$


The spherical polar coordinate unit vectors are shown in Fig. 2.7. 

It must be emphasized that the unit vectors $\mathbf{r}_0$, $\boldsymbol{\theta}_0$, and $\boldsymbol{\varphi}_0$ vary in direction as the
angles $\theta$ and $\varphi$ vary. Specifically, the $\theta$ and $\varphi$ derivatives of these spherical polar
coordinate unit vectors do not vanish (Exercise 2.5.2). When differentiating
vectors in spherical polar (or in any noncartesian system) this variation of the
unit vectors with position must not be neglected. In terms of the fixed direction
cartesian unit vectors i, j, and k,

$$\begin{aligned}
\mathbf{r}_0 &= \mathbf{i}\sin\theta\cos\varphi + \mathbf{j}\sin\theta\sin\varphi + \mathbf{k}\cos\theta, \\
\boldsymbol{\theta}_0 &= \mathbf{i}\cos\theta\cos\varphi + \mathbf{j}\cos\theta\sin\varphi - \mathbf{k}\sin\theta, \\
\boldsymbol{\varphi}_0 &= -\mathbf{i}\sin\varphi + \mathbf{j}\cos\varphi.
\end{aligned} \tag{2.42}$$
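A short numerical experiment makes the position dependence of these unit vectors concrete. The sketch below builds the three vectors from Eq. 2.42, checks that they form a right-handed orthonormal set, and estimates two derivatives of $\mathbf{r}_0$ by central differences. The helper names, the sample angles, and the step size are illustrative assumptions.

```python
import math

def spherical_unit_vectors(theta, phi):
    # Cartesian components of r_0, theta_0, phi_0 from Eq. 2.42
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return ((st * cp, st * sp, ct),
            (ct * cp, ct * sp, -st),
            (-sp, cp, 0.0))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

theta, phi, d = 0.8, 2.1, 1e-6
r0, theta0, phi0 = spherical_unit_vectors(theta, phi)

# Central-difference derivatives of r_0 with respect to theta and phi
dr0_dtheta = tuple((a - b) / (2 * d) for a, b in
                   zip(spherical_unit_vectors(theta + d, phi)[0],
                       spherical_unit_vectors(theta - d, phi)[0]))
dr0_dphi = tuple((a - b) / (2 * d) for a, b in
                 zip(spherical_unit_vectors(theta, phi + d)[0],
                     spherical_unit_vectors(theta, phi - d)[0]))

print(dot(r0, theta0), cross(r0, theta0), phi0)
print(dr0_dtheta, theta0)
print(dr0_dphi, tuple(math.sin(theta) * c for c in phi0))
```

The printed pairs should agree: $\mathbf{r}_0\times\boldsymbol{\theta}_0 = \boldsymbol{\varphi}_0$, $\partial\mathbf{r}_0/\partial\theta = \boldsymbol{\theta}_0$, and $\partial\mathbf{r}_0/\partial\varphi = \sin\theta\,\boldsymbol{\varphi}_0$, which are among the results asked for in Exercise 2.5.2.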

Note that a given vector can now be expressed in a number of different (but
equivalent) ways. For instance, the position vector $\mathbf{r}$ may be written

$$\begin{aligned}
\mathbf{r} &= \mathbf{r}_0r \\
&= \mathbf{r}_0(x^2 + y^2 + z^2)^{1/2} \\
&= \mathbf{i}x + \mathbf{j}y + \mathbf{k}z \\
&= \mathbf{i}r\sin\theta\cos\varphi + \mathbf{j}r\sin\theta\sin\varphi + \mathbf{k}r\cos\theta.
\end{aligned} \tag{2.43}$$

FIG. 2.7 Spherical polar coordinates

Select the form that is most useful for your particular problem.

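From Section 2.2, relabeling the curvilinear coordinate unit vectors $\mathbf{e}_1$, $\mathbf{e}_2$,
and $\mathbf{e}_3$ as $\mathbf{r}_0$, $\boldsymbol{\theta}_0$, and $\boldsymbol{\varphi}_0$ gives

$$\nabla\psi = \mathbf{r}_0\frac{\partial\psi}{\partial r} + \boldsymbol{\theta}_0\frac{1}{r}\frac{\partial\psi}{\partial\theta} + \boldsymbol{\varphi}_0\frac{1}{r\sin\theta}\frac{\partial\psi}{\partial\varphi}, \tag{2.44}$$

$$\nabla\cdot\mathbf{V} = \frac{1}{r^2\sin\theta}\left[\sin\theta\frac{\partial}{\partial r}(r^2V_r) + r\frac{\partial}{\partial\theta}(\sin\theta\,V_\theta) + r\frac{\partial V_\varphi}{\partial\varphi}\right], \tag{2.45}$$

$$\nabla\cdot\nabla\psi = \frac{1}{r^2\sin\theta}\left[\sin\theta\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{\sin\theta}\frac{\partial^2\psi}{\partial\varphi^2}\right], \tag{2.46}$$

$$\nabla\times\mathbf{V} = \frac{1}{r^2\sin\theta}\begin{vmatrix} \mathbf{r}_0 & r\boldsymbol{\theta}_0 & r\sin\theta\,\boldsymbol{\varphi}_0 \\[4pt] \dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\theta} & \dfrac{\partial}{\partial\varphi} \\[4pt] V_r & rV_\theta & r\sin\theta\,V_\varphi \end{vmatrix}. \tag{2.47}$$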


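Occasionally, the vector Laplacian $\nabla^2\mathbf{V}$ is needed in spherical polar coordinates.
It is best obtained by using the vector identity (Eq. 1.80) of Chapter 1.
For future reference

$$\begin{aligned}
\nabla^2\mathbf{V}|_r &= \left(-\frac{2}{r^2} + \frac{2}{r}\frac{\partial}{\partial r} + \frac{\partial^2}{\partial r^2} + \frac{\cos\theta}{r^2\sin\theta}\frac{\partial}{\partial\theta} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2} + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right)V_r \\
&\quad + \left(-\frac{2}{r^2}\frac{\partial}{\partial\theta} - \frac{2\cos\theta}{r^2\sin\theta}\right)V_\theta + \left(-\frac{2}{r^2\sin\theta}\frac{\partial}{\partial\varphi}\right)V_\varphi \\
&= \nabla^2V_r - \frac{2}{r^2}V_r - \frac{2}{r^2}\frac{\partial V_\theta}{\partial\theta} - \frac{2\cos\theta}{r^2\sin\theta}V_\theta - \frac{2}{r^2\sin\theta}\frac{\partial V_\varphi}{\partial\varphi},
\end{aligned} \tag{2.48}$$

$$\nabla^2\mathbf{V}|_\theta = \nabla^2V_\theta - \frac{1}{r^2\sin^2\theta}V_\theta + \frac{2}{r^2}\frac{\partial V_r}{\partial\theta} - \frac{2\cos\theta}{r^2\sin^2\theta}\frac{\partial V_\varphi}{\partial\varphi}, \tag{2.49}$$

$$\nabla^2\mathbf{V}|_\varphi = \nabla^2V_\varphi - \frac{1}{r^2\sin^2\theta}V_\varphi + \frac{2}{r^2\sin\theta}\frac{\partial V_r}{\partial\varphi} + \frac{2\cos\theta}{r^2\sin^2\theta}\frac{\partial V_\theta}{\partial\varphi}. \tag{2.50}$$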

(2.50) 


These expressions for the components of $\nabla^2\mathbf{V}$ are undeniably messy, but
sometimes they are needed. There is no guarantee that nature will always be
simple.


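EXAMPLE 2.5.1


Using Eqs. 2.44 to 2.47, we can reproduce by inspection some of the results
derived in Chapter 1 by laborious application of cartesian coordinates.

From Eq. 2.44

$$\nabla f(r) = \mathbf{r}_0\frac{df}{dr},\qquad \nabla r^n = \mathbf{r}_0\,nr^{n-1}. \tag{2.51}$$

From Eq. 2.45

$$\nabla\cdot\mathbf{r}_0f(r) = \frac{2}{r}f(r) + \frac{df}{dr},\qquad \nabla\cdot\mathbf{r}_0r^n = (n+2)r^{n-1}. \tag{2.52}$$

From Eq. 2.46

$$\nabla^2f(r) = \frac{2}{r}\frac{df}{dr} + \frac{d^2f}{dr^2}, \tag{2.53}$$

$$\nabla^2r^n = n(n+1)r^{n-2}. \tag{2.54}$$

Finally, from Eq. 2.47

$$\nabla\times\mathbf{r}_0f(r) = 0. \tag{2.55}$$

EXAMPLE 2.5.2 Magnetic Vector Potential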


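The computation of the magnetic vector potential of a single current loop
in the xy-plane involves the evaluation of

$$\mathbf{V} = \nabla\times[\nabla\times(\boldsymbol{\varphi}_0A_\varphi(r,\theta))]. \tag{2.56}$$

In spherical polar coordinates this reduces as follows:

$$\mathbf{V} = \nabla\times\frac{1}{r^2\sin\theta}\begin{vmatrix} \mathbf{r}_0 & r\boldsymbol{\theta}_0 & r\sin\theta\,\boldsymbol{\varphi}_0 \\[4pt] \dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\theta} & \dfrac{\partial}{\partial\varphi} \\[4pt] 0 & 0 & r\sin\theta\,A_\varphi(r,\theta) \end{vmatrix}.$$

Taking the curl a second time, we obtain

$$\mathbf{V} = \frac{1}{r^2\sin\theta}\begin{vmatrix} \mathbf{r}_0 & r\boldsymbol{\theta}_0 & r\sin\theta\,\boldsymbol{\varphi}_0 \\[4pt] \dfrac{\partial}{\partial r} & \dfrac{\partial}{\partial\theta} & \dfrac{\partial}{\partial\varphi} \\[4pt] \dfrac{1}{r^2\sin\theta}\dfrac{\partial}{\partial\theta}(r\sin\theta\,A_\varphi) & -\dfrac{1}{\sin\theta}\dfrac{\partial}{\partial r}(r\sin\theta\,A_\varphi) & 0 \end{vmatrix}. \tag{2.57}$$

By expanding the determinant, we have

$$\mathbf{V} = -\boldsymbol{\varphi}_0\left[\nabla^2A_\varphi(r,\theta) - \frac{1}{r^2\sin^2\theta}A_\varphi(r,\theta)\right]. \tag{2.58}$$

In Chapter 12 we shall see that $\mathbf{V}$ leads to the associated Legendre equation and
that $A_\varphi$ may be given by a series of associated Legendre polynomials.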


EXERCISES 

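2.5.1 Resolve the spherical polar unit vectors into their cartesian components.

ANS. $\mathbf{r}_0 = \mathbf{i}\sin\theta\cos\varphi + \mathbf{j}\sin\theta\sin\varphi + \mathbf{k}\cos\theta$,
$\boldsymbol{\theta}_0 = \mathbf{i}\cos\theta\cos\varphi + \mathbf{j}\cos\theta\sin\varphi - \mathbf{k}\sin\theta$,
$\boldsymbol{\varphi}_0 = -\mathbf{i}\sin\varphi + \mathbf{j}\cos\varphi$.

2.5.2 (a) From the results of Exercise 2.5.1 calculate the partial derivatives of $\mathbf{r}_0$,
$\boldsymbol{\theta}_0$, and $\boldsymbol{\varphi}_0$ with respect to $r$, $\theta$, and $\varphi$.

(b) With $\nabla$ given by

$$\mathbf{r}_0\frac{\partial}{\partial r} + \boldsymbol{\theta}_0\frac{1}{r}\frac{\partial}{\partial\theta} + \boldsymbol{\varphi}_0\frac{1}{r\sin\theta}\frac{\partial}{\partial\varphi}$$

(greatest space rate of change), use the results of part (a) to calculate $\nabla\cdot\nabla\psi$.
This is an alternate derivation of the Laplacian.

Note. The derivatives of the left-hand $\nabla$ operate on the unit vectors of the right-hand
$\nabla$ before the unit vectors are dotted together.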

2.5.3 A rigid body is rotating about a fixed axis with a constant angular velocity $\boldsymbol{\omega}$.
Take $\boldsymbol{\omega}$ to be along the z-axis. Using spherical polar coordinates,

(a) Calculate $\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$.

(b) Calculate $\nabla\times\mathbf{v}$.

ANS. (a) $\mathbf{v} = \boldsymbol{\varphi}_0\,\omega r\sin\theta$
(b) $\nabla\times\mathbf{v} = 2\boldsymbol{\omega}$.

2.5.4 The coordinate system $(x, y, z)$ is rotated through an angle $\Phi$ counterclockwise
about an axis defined by the unit vector $\mathbf{n}$ into system $(x', y', z')$. In terms of the
new coordinates the radius vector becomes

$$\mathbf{r}' = \mathbf{r}\cos\Phi + \mathbf{r}\times\mathbf{n}\sin\Phi + \mathbf{n}(\mathbf{n}\cdot\mathbf{r})(1 - \cos\Phi).$$

(a) Derive this expression from geometric considerations.

(b) Show that it reduces as expected for $\mathbf{n} = \mathbf{k}$. The answer, in matrix form,
appears in Section 4.3.

(c) Verify that $r'^2 = r^2$.

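2.5.5 Resolve the cartesian unit vectors into their spherical polar components.

$$\begin{aligned}
\mathbf{i} &= \mathbf{r}_0\sin\theta\cos\varphi + \boldsymbol{\theta}_0\cos\theta\cos\varphi - \boldsymbol{\varphi}_0\sin\varphi, \\
\mathbf{j} &= \mathbf{r}_0\sin\theta\sin\varphi + \boldsymbol{\theta}_0\cos\theta\sin\varphi + \boldsymbol{\varphi}_0\cos\varphi, \\
\mathbf{k} &= \mathbf{r}_0\cos\theta - \boldsymbol{\theta}_0\sin\theta.
\end{aligned}$$

2.5.6 The direction of one vector is given by the angles $\theta_1$ and $\varphi_1$. For a second vector
the corresponding angles are $\theta_2$ and $\varphi_2$. Show that the cosine of the included
angle $\gamma$ is given by

$$\cos\gamma = \cos\theta_1\cos\theta_2 + \sin\theta_1\sin\theta_2\cos(\varphi_1 - \varphi_2).$$

See Fig. 12.16.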

2.5.7 A certain vector V has no radial component. Its curl has no tangential com- 
ponents. What does this imply about the radial dependence of the tangential 
components of V? 

2.5.8 Modern physics lays great stress on the property of parity, that is, whether a quantity
remains invariant or changes sign under an inversion of the coordinate system.
In cartesian coordinates this means $x\to -x$, $y\to -y$, and $z\to -z$.

(a) Show that the inversion (reflection through the origin) of a point $(r, \theta, \varphi)$
relative to fixed x-, y-, z-axes consists of the transformation

$$r\to r,\qquad \theta\to\pi - \theta,\qquad \varphi\to\varphi\pm\pi.$$

(b) Show that $\mathbf{r}_0$ and $\boldsymbol{\varphi}_0$ have odd parity (reversal of direction) and that $\boldsymbol{\theta}_0$
has even parity.

2.5.9 With $\mathbf{A}$ any vector

$$\mathbf{A}\cdot\nabla\mathbf{r} = \mathbf{A}.$$

(a) Verify this result in cartesian coordinates.

(b) Verify this result using spherical polar coordinates. (Eq. 2.44 provides $\nabla$.)
In the language of dyadics (Section 3.5), $\nabla\mathbf{r}$ is the idemfactor, a unit dyadic.

2.5.10 A particle is moving through space. Find the spherical coordinate components
of its velocity and acceleration:

$$\begin{aligned}
v_r &= \dot r, \\
v_\theta &= r\dot\theta, \\
v_\varphi &= r\sin\theta\,\dot\varphi, \\
a_r &= \ddot r - r\dot\theta^2 - r\sin^2\theta\,\dot\varphi^2, \\
a_\theta &= r\ddot\theta + 2\dot r\dot\theta - r\sin\theta\cos\theta\,\dot\varphi^2, \\
a_\varphi &= r\sin\theta\,\ddot\varphi + 2\dot r\sin\theta\,\dot\varphi + 2r\cos\theta\,\dot\theta\dot\varphi.
\end{aligned}$$

Hint.

$$\mathbf{r}(t) = \mathbf{r}_0(t)r(t) = [\mathbf{i}\sin\theta(t)\cos\varphi(t) + \mathbf{j}\sin\theta(t)\sin\varphi(t) + \mathbf{k}\cos\theta(t)]r(t).$$

Using the Lagrangian techniques of Section 17.3, we may obtain these
results somewhat more elegantly. The dot in $\dot r$ means time derivative, $\dot r = dr/dt$.
The notation was originated by Newton.


2.5.11 A particle $m$ moves in response to a central force according to Newton’s second
law

$$m\ddot{\mathbf{r}} = \mathbf{r}_0f(r).$$

Show that $\mathbf{r}\times\dot{\mathbf{r}} = \mathbf{c}$, a constant, and that the geometric interpretation of this
leads to Kepler’s second law.


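2.5.12 Express $\partial/\partial x$, $\partial/\partial y$, $\partial/\partial z$ in spherical polar coordinates.

ANS.

$$\begin{aligned}
\frac{\partial}{\partial x} &= \sin\theta\cos\varphi\frac{\partial}{\partial r} + \cos\theta\cos\varphi\frac{1}{r}\frac{\partial}{\partial\theta} - \frac{\sin\varphi}{r\sin\theta}\frac{\partial}{\partial\varphi}, \\
\frac{\partial}{\partial y} &= \sin\theta\sin\varphi\frac{\partial}{\partial r} + \cos\theta\sin\varphi\frac{1}{r}\frac{\partial}{\partial\theta} + \frac{\cos\varphi}{r\sin\theta}\frac{\partial}{\partial\varphi}, \\
\frac{\partial}{\partial z} &= \cos\theta\frac{\partial}{\partial r} - \sin\theta\frac{1}{r}\frac{\partial}{\partial\theta}.
\end{aligned}$$

Hint. Equate $\nabla_{xyz}$ and $\nabla_{r\theta\varphi}$.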


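2.5.13 From Exercise 2.5.12 show that

$$-i\left(x\frac{\partial}{\partial y} - y\frac{\partial}{\partial x}\right) = -i\frac{\partial}{\partial\varphi}.$$

This is the quantum mechanical operator corresponding to the z-component
of angular momentum.


2.5.14 With the quantum mechanical angular momentum operator defined as $\mathbf{L} = -i(\mathbf{r}\times\nabla)$, show that

(a) $L_x + iL_y = e^{i\varphi}\left(\dfrac{\partial}{\partial\theta} + i\cot\theta\dfrac{\partial}{\partial\varphi}\right)$,

(b) $L_x - iL_y = -e^{-i\varphi}\left(\dfrac{\partial}{\partial\theta} - i\cot\theta\dfrac{\partial}{\partial\varphi}\right)$.

These are the raising and lowering operators of Sections 12.6 and 7.

2.5.15 Verify that $\mathbf{L}\times\mathbf{L} = i\mathbf{L}$ in spherical polar coordinates. $\mathbf{L} = -i(\mathbf{r}\times\nabla)$, the quantum
mechanical angular momentum operator.

Hint. Use spherical polar coordinates for $\mathbf{L}$ but cartesian components for the
cross product.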


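2.5.16 (a) From Eq. 2.44 show that

$$\mathbf{L} = -i(\mathbf{r}\times\nabla) = i\left(\boldsymbol{\theta}_0\frac{1}{\sin\theta}\frac{\partial}{\partial\varphi} - \boldsymbol{\varphi}_0\frac{\partial}{\partial\theta}\right).$$

(b) Resolving $\boldsymbol{\theta}_0$ and $\boldsymbol{\varphi}_0$ into cartesian components, determine $L_x$, $L_y$, and $L_z$
in terms of $\theta$, $\varphi$, and their derivatives.

(c) From $L^2 = L_x^2 + L_y^2 + L_z^2$ show that

$$L^2 = -\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) - \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2} = -r^2\nabla^2 + \frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right).$$


2.5.17 With $\mathbf{L} = -i\mathbf{r}\times\nabla$ verify the operator identities

(a) $\nabla = \mathbf{r}_0\dfrac{\partial}{\partial r} - i\dfrac{\mathbf{r}\times\mathbf{L}}{r^2}$,

(b) $\mathbf{r}\nabla^2 - \nabla\left(1 + r\dfrac{\partial}{\partial r}\right) = i\nabla\times\mathbf{L}$.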

This latter identity is useful in relating angular momentum and Legendre’s 
differential equation, Exercise 8.3.1. 


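2.5.18 Show that the following three forms (spherical coordinates) of $\nabla^2\psi(r)$ are equivalent.

(a) $\dfrac{1}{r^2}\dfrac{d}{dr}\left[r^2\dfrac{d\psi(r)}{dr}\right]$,

(b) $\dfrac{1}{r}\dfrac{d^2}{dr^2}[r\psi(r)]$,

(c) $\dfrac{d^2\psi(r)}{dr^2} + \dfrac{2}{r}\dfrac{d\psi(r)}{dr}$.

The second form is particularly convenient in establishing a correspondence
between spherical polar and cartesian descriptions of a problem. A generalization
of this appears in Exercise 8.6.11.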


2.5.19 One model of the solar corona assumes that the steady-state equation of heat flow

$$\nabla\cdot(k\nabla T) = 0$$

is satisfied. Here, $k$, the thermal conductivity, is proportional to $T^{5/2}$. Assuming
that the temperature $T$ is proportional to $r^n$, show that the heat flow equation
is satisfied by $T = T_0(r_0/r)^{2/7}$.


2.5.20 A certain force field is given by

$$\mathbf{F} = \mathbf{r}_0\frac{2P\cos\theta}{r^3} + \boldsymbol{\theta}_0\frac{P}{r^3}\sin\theta,\qquad r \ge P/2,$$

(in spherical polar coordinates).

(a) Examine $\nabla\times\mathbf{F}$ to see if a potential exists.

(b) Calculate $\oint\mathbf{F}\cdot d\boldsymbol{\lambda}$ for a unit circle in the plane $\theta = \pi/2$.

What does this indicate about the force being conservative or nonconservative?

(c) If you believe that $\mathbf{F}$ may be described by $\mathbf{F} = -\nabla\psi$, find $\psi$. Otherwise
simply state that no acceptable potential exists.


2.5.21 (a) Show that $\mathbf{A} = -\boldsymbol{\varphi}_0\cot\theta/r$ is a solution of $\nabla\times\mathbf{A} = \mathbf{r}_0/r^2$.

(b) Show that this spherical polar coordinate solution agrees with the solution
given for Exercise 1.13.5:

$$\mathbf{A} = \mathbf{i}\frac{yz}{r(x^2+y^2)} - \mathbf{j}\frac{xz}{r(x^2+y^2)}.$$

Note that the solution diverges for $\theta = 0, \pi$ corresponding to $x, y = 0$.

(c) Finally, show that $\mathbf{A} = -\boldsymbol{\theta}_0\,\varphi\sin\theta/r$ is a solution. Note that although this
solution does not diverge ($r \neq 0$) it is no longer single-valued for all possible
azimuth angles.


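2.5.22 A magnetic vector potential is given by

$$\mathbf{A} = \frac{\mu_0}{4\pi}\frac{\mathbf{m}\times\mathbf{r}}{r^3}.$$

Show that this leads to the magnetic induction $\mathbf{B}$ of a point magnetic dipole,
dipole moment $\mathbf{m}$.

ANS. For $\mathbf{m} = \mathbf{k}m$,

$$\nabla\times\mathbf{A} = \mathbf{r}_0\frac{\mu_0}{4\pi}\frac{2m\cos\theta}{r^3} + \boldsymbol{\theta}_0\frac{\mu_0}{4\pi}\frac{m\sin\theta}{r^3}.$$

Compare Eqs. 12.136 and 12.137.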


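2.5.23 At large distances from its source, electric dipole radiation has fields

$$\mathbf{E} = a_E\sin\theta\,\frac{e^{i(kr-\omega t)}}{r}\boldsymbol{\theta}_0,\qquad \mathbf{B} = a_B\sin\theta\,\frac{e^{i(kr-\omega t)}}{r}\boldsymbol{\varphi}_0.$$

Show that Maxwell’s equations

$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}\qquad\text{and}\qquad\nabla\times\mathbf{B} = \varepsilon_0\mu_0\frac{\partial\mathbf{E}}{\partial t}$$

are satisfied, if we take

$$\frac{a_E}{a_B} = \frac{\omega}{k} = c = (\varepsilon_0\mu_0)^{-1/2}.$$

Hint. Since $r$ is large, terms of order $r^{-2}$ may be dropped.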


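2.5.24 The magnetic vector potential for a uniformly charged rotating spherical shell is

$$\mathbf{A} = \begin{cases} \boldsymbol{\varphi}_0\dfrac{\mu_0a^4\sigma\omega}{3}\,\dfrac{\sin\theta}{r^2}, & r > a, \\[8pt] \boldsymbol{\varphi}_0\dfrac{\mu_0a\sigma\omega}{3}\,r\sin\theta, & r < a. \end{cases}$$

($a$ = radius of spherical shell, $\sigma$ surface charge density, and $\omega$ angular velocity.)
Find the magnetic induction $\mathbf{B} = \nabla\times\mathbf{A}$.

ANS.

$$\begin{aligned}
B_r(r,\theta) &= \frac{2\mu_0a^4\sigma\omega}{3}\,\frac{\cos\theta}{r^3}, & r > a, \\
B_\theta(r,\theta) &= \frac{\mu_0a^4\sigma\omega}{3}\,\frac{\sin\theta}{r^3}, & r > a, \\
\mathbf{B} &= \mathbf{k}\frac{2\mu_0a\sigma\omega}{3}, & r < a.
\end{aligned}$$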

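2.5.25 (a) Explain why $\nabla^2$ in plane polar coordinates follows from $\nabla^2$ in circular
cylindrical coordinates with $z = \text{constant}$.

(b) Explain why taking $\nabla^2$ in spherical polar coordinates and restricting $\theta$
to $\pi/2$ does not lead to the plane polar form of $\nabla^2$.

Note.

$$\nabla^2(\rho,\varphi) = \frac{\partial^2}{\partial\rho^2} + \frac{1}{\rho}\frac{\partial}{\partial\rho} + \frac{1}{\rho^2}\frac{\partial^2}{\partial\varphi^2}.$$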


2.6 SEPARATION OF VARIABLES 


CARTESIAN COORDINATES 

In cartesian coordinates the Helmholtz equation (Eq. 2.1) becomes

$$\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2} + k^2\psi = 0, \tag{2.59}$$

using Eq. 2.26 for the Laplacian. For the present let $k^2$ be a constant. Perhaps
the simplest way of treating a partial differential equation such as 2.59 is to
split it into a set of ordinary differential equations. This may be done as follows:
Let

$$\psi(x,y,z) = X(x)Y(y)Z(z) \tag{2.60}$$

and substitute back into Eq. 2.59. How do we know Eq. 2.60 is valid? The
answer is very simple. We do not know it is valid! Rather, we are proceeding
in the spirit of let’s try and see if it works. If our attempt succeeds, then Eq. 2.60
will be justified. If it does not succeed, we shall find out soon enough and then
we shall try another attack such as Green’s functions, integral transforms,
or brute force numerical analysis. With $\psi$ assumed given by Eq. 2.60, Eq. 2.59
becomes

$$YZ\frac{d^2X}{dx^2} + XZ\frac{d^2Y}{dy^2} + XY\frac{d^2Z}{dz^2} + k^2XYZ = 0. \tag{2.61}$$

Dividing by $\psi = XYZ$ and rearranging terms, we obtain

$$\frac{1}{X}\frac{d^2X}{dx^2} = -k^2 - \frac{1}{Y}\frac{d^2Y}{dy^2} - \frac{1}{Z}\frac{d^2Z}{dz^2}. \tag{2.62}$$


Equation 2.62 exhibits one separation of variables. The left-hand side is a
function of $x$ alone, whereas the right-hand side depends only on $y$ and $z$.
So Eq. 2.62 is a sort of paradox. A function of $x$ is equated to a function of $y$
and $z$, but $x$, $y$, and $z$ are all independent coordinates. This independence means
that the behavior of $x$ as an independent variable is not determined by $y$ and $z$.
The paradox is resolved by setting each side equal to a constant, a constant of
separation. We choose$^1$

$$\frac{1}{X}\frac{d^2X}{dx^2} = -l^2, \tag{2.63}$$

$$-k^2 - \frac{1}{Y}\frac{d^2Y}{dy^2} - \frac{1}{Z}\frac{d^2Z}{dz^2} = -l^2. \tag{2.64}$$

Now, turning our attention to Eq. 2.64, we obtain

$$\frac{1}{Y}\frac{d^2Y}{dy^2} = -k^2 + l^2 - \frac{1}{Z}\frac{d^2Z}{dz^2}, \tag{2.65}$$

and a second separation has been achieved. Here we have a function of $y$ equated
to a function of $z$ and the same paradox appears. We resolve it as before by
equating each side to another constant of separation, $-m^2$.

$$\frac{1}{Y}\frac{d^2Y}{dy^2} = -m^2, \tag{2.66}$$

$$\frac{1}{Z}\frac{d^2Z}{dz^2} = -k^2 + l^2 + m^2 = -n^2, \tag{2.67}$$

introducing a constant $n^2$ by $k^2 = l^2 + m^2 + n^2$ to produce a symmetric set of
equations. Now we have three ordinary differential equations (2.63, 2.66, and
2.67) to replace Eq. 2.59. Our assumption (Eq. 2.60) has succeeded and is
thereby justified.
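The outcome of the separation can be spot-checked numerically. In the sketch below, one particular product of sinusoids (each factor solving its own ordinary differential equation with separation constants $l$, $m$, $n$) is substituted into the Helmholtz equation with $k^2 = l^2 + m^2 + n^2$, using second differences for the Laplacian. The specific factors, sample point, and step size are illustrative assumptions.

```python
import math

def psi(x, y, z, l, m, n):
    # One product solution X(x) Y(y) Z(z); each factor solves its own ODE.
    return math.sin(l * x) * math.cos(m * y) * math.sin(n * z)

def helmholtz_residual(x, y, z, l, m, n, d=1e-3):
    """Finite-difference value of (Laplacian + k^2) psi with k^2 = l^2 + m^2 + n^2."""
    k_squared = l * l + m * m + n * n
    f = lambda a, b, c: psi(a, b, c, l, m, n)
    centre = f(x, y, z)
    laplacian = ((f(x + d, y, z) + f(x - d, y, z) - 2 * centre)
                 + (f(x, y + d, z) + f(x, y - d, z) - 2 * centre)
                 + (f(x, y, z + d) + f(x, y, z - d) - 2 * centre)) / d ** 2
    return laplacian + k_squared * centre

residual = helmholtz_residual(0.3, 0.7, 1.1, l=1.0, m=2.0, n=3.0)
print(residual)
```

The residual should be zero up to the finite-difference error. Replacing `k_squared` by any value other than $l^2 + m^2 + n^2$ leaves a residual proportional to $\psi$, which is the numerical counterpart of the constraint on the separation constants.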

Our solution should be labeled according to the choice of our constants $l$,
$m$, and $n$, that is,

$$\psi_{lmn}(x,y,z) = X_l(x)Y_m(y)Z_n(z). \tag{2.68}$$

Subject to the conditions of the problem being solved and to the condition
$k^2 = l^2 + m^2 + n^2$, we may choose $l$, $m$, and $n$ as we like, and Eq. 2.68 will
still be a solution of Eq. 2.1, provided $X_l(x)$ is a solution of Eq. 2.63, and so on.
We may develop the most general solution of Eq. 2.1 by taking a linear combination
of solutions $\psi_{lmn}$,

$$\Psi = \sum_{l,m,n}a_{lmn}\psi_{lmn}. \tag{2.69}$$

The constant coefficients $a_{lmn}$ are finally chosen to permit $\Psi$ to satisfy the
boundary conditions of the problem.


$^1$ The choice of sign, completely arbitrary here, will be fixed in specific
problems by the need to satisfy specific boundary conditions.
LINEAR OPERATORS

How is this possible? What is the justification for writing Eq. 2.69? The
justification is found in noting that $\nabla^2 + k^2$ is a linear (differential) operator.
A linear operator $\mathcal{L}$ is defined as an operator with the following two properties:

$$\mathcal{L}(a\psi) = a\mathcal{L}\psi,$$

where $a$ is a constant and

$$\mathcal{L}(\psi_1 + \psi_2) = \mathcal{L}\psi_1 + \mathcal{L}\psi_2.$$

The derivatives $d^n/dx^n$ and the integral $\int[\ ]\,dx$ are examples of linear
operators. The square $(\ )^2$ and $\sin$ are examples of nonlinear operators.
In general, $(ax)^2 \neq ax^2$ and $\sin(\theta + \varphi) \neq \sin\theta + \sin\varphi$. As a consequence of
the defining properties, any linear combination of solutions of a linear, homogeneous
differential equation is also a solution. From its explicit form $\nabla^2 + k^2$ is seen to
have these two properties (and is therefore a linear operator). Equation 2.69 then
follows as a direct application of these two defining properties.$^2$
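The two defining properties can be tested directly on a computer. The sketch below represents operators as Python functions that map a function to a function, and measures by how much an operator fails to satisfy $\mathcal{L}(af + bg) = a\mathcal{L}f + b\mathcal{L}g$ at one point. The one-dimensional operator $d^2/dx^2 + k^2$ stands in for $\nabla^2 + k^2$; all names and sample values are illustrative assumptions.

```python
import math

def second_derivative(f, d=1e-3):
    # Central second difference, a discrete stand-in for d^2/dx^2
    return lambda x: (f(x + d) - 2.0 * f(x) + f(x - d)) / d ** 2

def helmholtz_1d(f, k=2.0):
    # One-dimensional analogue of the operator (del^2 + k^2)
    d2f = second_derivative(f)
    return lambda x: d2f(x) + k * k * f(x)

def square(f):
    # The nonlinear operator ( )^2
    return lambda x: f(x) ** 2

def linearity_defect(op, f, g, a, b, x):
    """op(a f + b g)(x) - [a op(f)(x) + b op(g)(x)]; zero for a linear operator."""
    combo = lambda t: a * f(t) + b * g(t)
    return op(combo)(x) - (a * op(f)(x) + b * op(g)(x))

defect_linear = linearity_defect(helmholtz_1d, math.sin, math.exp, 2.0, -3.0, 0.5)
defect_square = linearity_defect(square, math.sin, math.exp, 2.0, -3.0, 0.5)
print(defect_linear, defect_square)
```

The first defect should vanish to rounding error, while the second is of order 10, which is why superposition (Eq. 2.69) is available for the Helmholtz equation but not for a nonlinear equation.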

A further generalization may be noted. The separation process just described
would go through just as well for

$$k^2 = f(x) + g(y) + h(z) + k'^2, \tag{2.70}$$

with $k'^2$ a new constant.

We would simply have

$$\frac{1}{X}\frac{d^2X}{dx^2} + f(x) = -l^2 \tag{2.71}$$

replacing Eq. 2.63. The solutions $X$, $Y$, and $Z$ would be different, but the
technique of splitting the partial differential equation and of taking a linear
combination of solutions would be the same.

In case the reader wonders what is going on here, this technique of separation 
of variables of a partial differential equation has been introduced to illustrate 
the usefulness of these coordinate systems. The solutions of the resultant 
ordinary differential equations are developed in Chapters 8 through 13. 

CIRCULAR CYLINDRICAL COORDINATES 

With our unknown function i j/ dependent on p, <p, and z, the Helmholtz 
equation becomes 

(P, z) + k 2 \p(p, (p, z) = 0, (2.72) 


or 


1A 

P dp 



+- k 2 \p = 


n 


(2.73) 


2 We are especially interested in linear operators because in quantum mechan- 
ics physical quantities are represented by linear operators operating in a 
complex, infinite dimensional Hilbert space. 
As before, we assume a factored form for $\psi$,

$$\psi(\rho, \varphi, z) = P(\rho)\Phi(\varphi)Z(z). \tag{2.74}$$

Substituting into Eq. 2.73, we have

$$\frac{\Phi Z}{\rho}\frac{d}{d\rho}\left(\rho\frac{dP}{d\rho}\right) + \frac{PZ}{\rho^2}\frac{d^2\Phi}{d\varphi^2} + P\Phi\frac{d^2Z}{dz^2} + k^2P\Phi Z = 0. \tag{2.75}$$


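All the partial derivatives have become ordinary derivatives. Dividing by
$P\Phi Z$ and moving the $z$ derivative to the right-hand side yields

$$\frac{1}{\rho P}\frac{d}{d\rho}\left(\rho\frac{dP}{d\rho}\right) + \frac{1}{\rho^2\Phi}\frac{d^2\Phi}{d\varphi^2} + k^2 = -\frac{1}{Z}\frac{d^2Z}{dz^2}. \tag{2.76}$$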


Again, we have the paradox. A function of $z$ on the right appears to depend
on a function of $\rho$ and $\varphi$ on the left. We resolve the paradox by setting each
side of Eq. 2.76 equal to a constant, the same constant. Let us choose³ $-l^2$.
Then

$$\frac{d^2Z}{dz^2} = l^2Z, \tag{2.77}$$

and

$$\frac{1}{\rho P}\frac{d}{d\rho}\left(\rho\frac{dP}{d\rho}\right) + \frac{1}{\rho^2\Phi}\frac{d^2\Phi}{d\varphi^2} + k^2 = -l^2. \tag{2.78}$$


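Setting $k^2 + l^2 = n^2$, multiplying by $\rho^2$, and rearranging terms, we obtain

$$\frac{\rho}{P}\frac{d}{d\rho}\left(\rho\frac{dP}{d\rho}\right) + n^2\rho^2 = -\frac{1}{\Phi}\frac{d^2\Phi}{d\varphi^2}. \tag{2.79}$$

We may set the right-hand side equal to $m^2$, so that

$$\frac{d^2\Phi}{d\varphi^2} = -m^2\Phi. \tag{2.80}$$

Finally, for the $\rho$ dependence we have

$$\rho\frac{d}{d\rho}\left(\rho\frac{dP}{d\rho}\right) + (n^2\rho^2 - m^2)P = 0. \tag{2.81}$$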


This is Bessel's differential equation. The solutions and their properties are
presented in Chapter 11.

The original Helmholtz equation, a three-dimensional partial differential
equation, has been replaced by three ordinary differential equations, Eqs. 2.77,
2.80, and 2.81. A solution of the Helmholtz equation is

$$\psi(\rho, \varphi, z) = P(\rho)\Phi(\varphi)Z(z). \tag{2.74}$$


3 The choice of sign of the separation constant is arbitrary. However, a minus
sign is chosen for the axial coordinate $z$ in expectation of a possible exponential
dependence on $z$ (from Eq. 2.77). A positive sign is chosen for the azimuthal
coordinate $\varphi$ in expectation of a periodic dependence on $\varphi$ (from Eq. 2.80).
Identifying the specific $P$, $\Phi$, $Z$ solutions by subscripts, we see that the most
general solution of the Helmholtz equation is a linear combination of the
product solutions:

$$\psi(\rho, \varphi, z) = \sum_{m,n} a_{mn}P_{mn}(\rho)\Phi_m(\varphi)Z_n(z). \tag{2.82}$$
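The radial equation, Eq. 2.81, can be spot-checked numerically. The following is a minimal sketch, assuming that the Bessel function $J_m(n\rho)$ is a solution (as developed in Chapter 11) and evaluating it from its standard power series; the helper names `bessel_j` and `radial_residual`, the sample values of $n$ and $m$, and the finite-difference step are illustrative assumptions.

```python
import math

def bessel_j(m, x, terms=40):
    """Power series for J_m(x), integer m >= 0 (adequate for moderate x)."""
    total = 0.0
    for s in range(terms):
        coeff = (-1) ** s / (math.factorial(s) * math.factorial(s + m))
        total += coeff * (x / 2.0) ** (2 * s + m)
    return total

def radial_residual(P, rho, n, m, h=1e-4):
    """Left side of Eq. 2.81: rho d/drho(rho dP/drho) + (n^2 rho^2 - m^2) P."""
    d1 = (P(rho + h) - P(rho - h)) / (2.0 * h)
    d2 = (P(rho + h) - 2.0 * P(rho) + P(rho - h)) / h**2
    # rho d/drho (rho dP/drho) = rho^2 P'' + rho P'
    return rho**2 * d2 + rho * d1 + (n**2 * rho**2 - m**2) * P(rho)

n, m = 2.0, 3                      # sample separation constants
P = lambda rho: bessel_j(m, n * rho)
residuals = [radial_residual(P, rho, n, m) for rho in (0.5, 1.0, 2.5)]
print(residuals)
```

The residuals are small compared with the individual terms of the equation, limited only by the finite-difference approximation.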


SPHERICAL POLAR COORDINATES 

Let us try to separate Eq. 2.1, again with k 2 constant, in spherical polar 
coordinates. Using Eq. 2.46, we obtain 

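$$\frac{1}{r^2\sin\theta}\left[\sin\theta\frac{\partial}{\partial r}\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{\sin\theta}\frac{\partial^2\psi}{\partial\varphi^2}\right] = -k^2\psi. \tag{2.83}$$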


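Now, in analogy with Eq. 2.60 we try

$$\psi(r, \theta, \varphi) = R(r)\Theta(\theta)\Phi(\varphi). \tag{2.84}$$

By substituting back into Eq. 2.83 and dividing by $R\Theta\Phi$, we have

$$\frac{1}{Rr^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{1}{\Theta r^2\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{1}{\Phi r^2\sin^2\theta}\frac{d^2\Phi}{d\varphi^2} = -k^2. \tag{2.85}$$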

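Note that all derivatives are now ordinary derivatives rather than partials. By
multiplying by $r^2\sin^2\theta$, we can isolate $(1/\Phi)(d^2\Phi/d\varphi^2)$ to obtain⁴

$$\frac{1}{\Phi}\frac{d^2\Phi}{d\varphi^2} = r^2\sin^2\theta\left[-k^2 - \frac{1}{r^2R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) - \frac{1}{r^2\sin\theta\,\Theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right)\right]. \tag{2.86}$$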


Equation 2.86 relates a function of $\varphi$ alone to a function of $r$ and $\theta$ alone.
Since $r$, $\theta$, and $\varphi$ are independent variables, we equate each side of Eq. 2.86 to a
constant. Here a little consideration can simplify the later analysis. In almost
all physical problems $\varphi$ will appear as an azimuth angle. This suggests a periodic
solution rather than an exponential. With this in mind, let us use $-m^2$ as the
separation constant. Any constant will do, but this one will make life a little
easier. Then


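$$\frac{1}{\Phi}\frac{d^2\Phi}{d\varphi^2} = -m^2 \tag{2.87}$$

and

$$\frac{1}{r^2R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + \frac{1}{r^2\sin\theta\,\Theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) - \frac{m^2}{r^2\sin^2\theta} = -k^2. \tag{2.88}$$

Multiplying Eq. 2.88 by $r^2$ and rearranging terms, we obtain

$$\frac{1}{R}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + r^2k^2 = -\frac{1}{\sin\theta\,\Theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + \frac{m^2}{\sin^2\theta}. \tag{2.89}$$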


Again, the variables are separated. We equate each side to a constant Q and 
finally obtain 


4 The order in which the variables are separated here is not unique. Many 
quantum mechanics texts show the r dependence split off first. 





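$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) - \frac{m^2}{\sin^2\theta}\Theta + Q\Theta = 0, \tag{2.90}$$

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) + k^2R - \frac{QR}{r^2} = 0. \tag{2.91}$$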


Once more we have replaced a partial differential equation of three variables by
three ordinary differential equations. The solutions of these ordinary differential
equations are discussed in Chapters 11 and 12. In Chapter 12, for example,
Eq. 2.90 is identified as the associated Legendre equation in which the constant
$Q$ becomes $l(l + 1)$; $l$ is an integer. If $k^2$ is a (positive) constant, Eq. 2.91 becomes
the spherical Bessel equation of Section 11.7.
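As a hedged numerical illustration of Eqs. 2.90 and 2.91, the sketch below takes $l = 2$, $m = 1$, so that $Q = l(l+1) = 6$. It assumes the standard closed forms $\Theta \propto \sin\theta\cos\theta$ (proportional to the associated Legendre function $P_2^1(\cos\theta)$) and $R = j_2(kr)$ (a spherical Bessel function); these solutions are developed in later chapters, and the function names, the value of $k$, and the sample points are illustrative assumptions.

```python
import math

def theta_residual(Theta, theta, m, Q, h=1e-4):
    """Left-hand side of Eq. 2.90 at angle theta, by central differences."""
    def flux(t):
        # sin(t) * dTheta/dt
        return math.sin(t) * (Theta(t + h) - Theta(t - h)) / (2.0 * h)
    d_flux = (flux(theta + h) - flux(theta - h)) / (2.0 * h)
    s = math.sin(theta)
    return d_flux / s - m**2 * Theta(theta) / s**2 + Q * Theta(theta)

def radial_residual(R, r, k, Q, h=1e-4):
    """Left-hand side of Eq. 2.91 at radius r, by central differences."""
    d1 = (R(r + h) - R(r - h)) / (2.0 * h)
    d2 = (R(r + h) - 2.0 * R(r) + R(r - h)) / h**2
    # (1/r^2) d/dr (r^2 dR/dr) = R'' + (2/r) R'
    return d2 + 2.0 * d1 / r + k**2 * R(r) - Q * R(r) / r**2

def j2(x):
    """Spherical Bessel function j_2(x) in closed form."""
    return (3.0 / x**3 - 1.0 / x) * math.sin(x) - (3.0 / x**2) * math.cos(x)

l, m, k = 2, 1, 1.3                # sample values for this sketch
Q = l * (l + 1)
Theta = lambda t: math.sin(t) * math.cos(t)
R = lambda r: j2(k * r)

theta_residuals = [theta_residual(Theta, t, m, Q) for t in (0.4, 1.0, 2.2)]
radial_residuals = [radial_residual(R, r, k, Q) for r in (1.0, 2.0, 3.5)]
print(theta_residuals, radial_residuals)
```

Both lists contain only values at the level of the finite-difference error, consistent with the same constant $Q$ appearing in the angular and radial equations.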

Again, our most general solution may be written 

$$\psi_{Qm}(r, \theta, \varphi) = \sum_{Q,m} R_Q(r)\Theta_{Qm}(\theta)\Phi_m(\varphi). \tag{2.92}$$

The restriction that $k^2$ be a constant is unnecessarily severe. The separation
process will still be possible for $k^2$ as general as

$$k^2 = f(r) + \frac{1}{r^2}g(\theta) + \frac{1}{r^2\sin^2\theta}h(\varphi) + k'^2. \tag{2.93}$$


In the hydrogen atom problem, one of the most important examples of the
Schrödinger wave equation with a closed form solution, we have $k^2 = f(r)$.
Equation 2.91 for the hydrogen atom becomes the associated Laguerre equation.

The great importance of this separation of variables in spherical polar
coordinates stems from the fact that the case $k^2 = k^2(r)$ covers a tremendous
amount of physics: a great deal of the theories of gravitation, electrostatics,
atomic physics, and nuclear physics. And, with $k^2 = k^2(r)$, the angular dependence
is isolated in Eqs. 2.87 and 2.90, which can be solved exactly.

Separation of variables and an investigation of the resulting ordinary 
differential equations are discussed again in Section 8.3. 


EXERCISES 


2.6.1 By letting the operator $\nabla^2 + k^2$ act on the general form $a_1\psi_1(x, y, z) + a_2\psi_2(x, y, z)$,
show that it is linear, that is, $(\nabla^2 + k^2)(a_1\psi_1 + a_2\psi_2) = a_1(\nabla^2 + k^2)\psi_1 + a_2(\nabla^2 + k^2)\psi_2$.

2.6.2 Show that the Helmholtz equation

$$\nabla^2\psi + k^2\psi = 0$$

is still separable in circular cylindrical coordinates if $k^2$ is generalized to
$k^2 + f(\rho) + (1/\rho^2)g(\varphi) + h(z)$.

2.6.3 Separate variables in the Helmholtz equation in spherical polar coordinates, splitting
off the radial dependence first. Show that your separated equations have the
same form as Eqs. 2.87, 2.90, and 2.91.

2.6.4 Verify that

$$\nabla^2\psi(r, \theta, \varphi) + \left[k^2 + f(r) + \frac{1}{r^2}g(\theta) + \frac{1}{r^2\sin^2\theta}h(\varphi)\right]\psi(r, \theta, \varphi) = 0$$

is separable (in spherical polar coordinates). The functions $f$, $g$, and $h$ are functions
only of the variables indicated; $k^2$ is a constant.

2.6.5 An atomic (quantum mechanical) particle is confined inside a rectangular box of
sides $a$, $b$, and $c$. The particle is described by a wave function $\psi$ which satisfies the
Schrödinger wave equation

$$-\frac{\hbar^2}{2m}\nabla^2\psi = E\psi.$$

The wave function is required to vanish at each surface of the box (but not to be
identically zero). This condition imposes constraints on the separation constants
and therefore on the energy $E$. What is the smallest value of $E$ for which such a
solution can be obtained?

ANS. $E = \dfrac{\pi^2\hbar^2}{2m}\left(\dfrac{1}{a^2} + \dfrac{1}{b^2} + \dfrac{1}{c^2}\right)$.


2.6.6 For a homogeneous spherical solid with constant thermal diffusivity, $K$, and no
heat sources the equation of heat conduction becomes

$$\frac{\partial T(r, t)}{\partial t} = K\nabla^2T(r, t).$$

Assume a solution of the form

$$T = R(r)T(t)$$

and separate variables. Show that the radial equation may take on the standard
form

$$r^2\frac{d^2R}{dr^2} + 2r\frac{dR}{dr} + [\alpha^2r^2 - n(n + 1)]R = 0; \qquad n = \text{integer}.$$

The solutions of this equation are called spherical Bessel functions.

2.6.7 Separate variables in the thermal diffusion equation of Exercise 2.6.6 in circular
cylindrical coordinates. Assume that you can neglect end effects and take
$T = T(\rho, t)$.


Additional exercises on separation of variables appear at the end of Section 8.3. 
3 TENSOR
ANALYSIS 


3.1 INTRODUCTION, DEFINITIONS 

Tensors are important in many areas of physics, including general relativity 
and electromagnetic theory. One of the more prolific sources of tensor quan- 
tities is the anisotropic solid. Here the elastic, optical, electrical, and magnetic 
properties may well involve tensors. The elastic properties of the anisotropic 
solid are considered in some detail in Section 3.6. As an introductory illustra- 
tion, let us consider the flow of electric current. We can write Ohm’s law in the 
usual form 

$$\mathbf{J} = \sigma\mathbf{E}, \tag{3.1}$$

with current density $\mathbf{J}$ and electric field $\mathbf{E}$, both vector quantities.¹ If we have an
isotropic medium, $\sigma$, the conductivity, is a scalar, and for the x-component, for
example,

$$J_1 = \sigma E_1. \tag{3.2}$$

However, if our medium is anisotropic, as in many crystals, or a plasma in the
presence of a magnetic field, the current density in the x-direction may depend
on the electric fields in the y- and z-directions as well as on the field in the
x-direction. Assuming a linear relationship, we must replace Eq. 3.2 with

$$J_1 = \sigma_{11}E_1 + \sigma_{12}E_2 + \sigma_{13}E_3, \tag{3.3}$$

and, in general,

$$J_i = \sum_k \sigma_{ik}E_k. \tag{3.4}$$

For ordinary three-dimensional space the scalar conductivity $\sigma$ has given way
to a set of nine elements, $\sigma_{ik}$.



This array of nine elements actually forms a tensor, as shown in Section 3.3. 
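A short numerical sketch of Eq. 3.4 may help here. The conductivity values below are invented purely for illustration (they do not describe any particular material), and `apply_tensor` is a hypothetical helper name.

```python
def apply_tensor(sigma, E):
    """J_i = sum_k sigma_ik E_k (Eq. 3.4) for a 3x3 nested list sigma."""
    return [sum(sigma[i][k] * E[k] for k in range(3)) for i in range(3)]

# Isotropic medium: sigma is a scalar times the unit array.
sigma_iso = [[2.0, 0.0, 0.0],
             [0.0, 2.0, 0.0],
             [0.0, 0.0, 2.0]]

# Anisotropic medium: off-diagonal elements couple different directions.
sigma_aniso = [[2.0, 0.5, 0.0],
               [0.5, 1.0, 0.0],
               [0.0, 0.0, 3.0]]

E = [1.0, 0.0, 0.0]                  # field along the x-direction only

J_iso = apply_tensor(sigma_iso, E)
J_aniso = apply_tensor(sigma_aniso, E)
print(J_iso, J_aniso)
```

With the field along x, the isotropic current is also along x, whereas the anisotropic current acquires a y-component: J is no longer parallel to E.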

1 Another example of this type of physical equation appears in Section 4.6. 
Generalizing Eq. 3.1, if a relation A = B C with A and C as nonparallel
vectors holds in all orientations of a cartesian system, then B is a (second-rank) 
tensor. This is proven in Section 3.3. Other physical problems giving rise to 
tensors include elasticity (Section 3.6), electromagnetism (Section 3.7), the 
inertia matrix (Section 4.6), and above all, general relativity. 

In Chapter 1 a quantity that did not change under rotations of the coordinate 
system, an invariant quantity, was labeled a scalar. A quantity whose com- 
ponents transformed like those of the distance of a point from a chosen origin 
(Eq. 1.9, Section 1.2) was called a vector. This transformation property was 
adopted as the defining characteristic of a vector. The transformation of the 
components of the vector under a rotation of the coordinates just preserves the 
vector as a geometric entity (such as an arrow in space), independent of the 
orientation of the reference frame. 

There is a possible ambiguity in this transformation definition of vector
(Eq. 3.6),

$$A'_i = \sum_j a_{ij}A_j, \tag{3.6}$$

in which $a_{ij}$ is the cosine of the angle between the $x'_i$-axis and the $x_j$-axis.

If we start with a differential distance vector $d\mathbf{r}$, then, taking $dx'_i$ to be a
function of the unprimed variables,

$$dx'_i = \sum_j \frac{\partial x'_i}{\partial x_j}\,dx_j \tag{3.7}$$

by partial differentiation. If we set

$$a_{ij} = \frac{\partial x'_i}{\partial x_j}, \tag{3.8}$$

Eqs. 3.6 and 3.7 are consistent. Any set of quantities $A_j$ transforming according
to

$$A'_i = \sum_j \frac{\partial x'_i}{\partial x_j}A_j \tag{3.9}$$

is defined as a contravariant vector.

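However, we have already encountered a slightly different type of vector
transformation. The gradient of a scalar, $\nabla\varphi$, defined by

$$\nabla\varphi = \mathbf{i}\frac{\partial\varphi}{\partial x_1} + \mathbf{j}\frac{\partial\varphi}{\partial x_2} + \mathbf{k}\frac{\partial\varphi}{\partial x_3} \tag{3.10}$$

(using $x_1$, $x_2$, $x_3$ for $x$, $y$, $z$), transforms as

$$\frac{\partial\varphi'}{\partial x'_i} = \sum_j \frac{\partial\varphi}{\partial x_j}\frac{\partial x_j}{\partial x'_i}, \tag{3.11}$$

using $\varphi = \varphi(x, y, z) = \varphi(x', y', z') = \varphi'$, $\varphi$ defined as a scalar quantity. Notice
that this differs from Eq. 3.9 in that we have $\partial x_j/\partial x'_i$ instead of $\partial x'_i/\partial x_j$. Equation
3.11 is taken as the definition of a covariant vector with the gradient as the
prototype.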
In cartesian coordinates

$$\frac{\partial x_j}{\partial x'_i} = \frac{\partial x'_i}{\partial x_j} = a_{ij} \tag{3.12}$$

and there is no difference between contravariant and covariant transformations.
In other systems Eq. 3.12 in general does not apply, and the distinction between
contravariant and covariant is real and must be observed. This is of prime
importance in the curved Riemannian space of general relativity. A much
simpler example is provided by the oblique coordinates of Section 4.4.

In the remainder of this section the components of a contravariant vector
are denoted by a superscript, $A^i$, whereas a subscript is used for the components
of a covariant vector $A_i$.²


Definition of Second-rank Tensors 

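To remove some of the fear and mystery from the term tensor, let us rechristen
a scalar as a tensor of rank zero and relabel a vector as a tensor of first rank.
Then we proceed to define contravariant, mixed, and covariant tensors of
second rank by the following equations:

$$A'^{ij} = \sum_{kl}\frac{\partial x'_i}{\partial x_k}\frac{\partial x'_j}{\partial x_l}A^{kl},$$

$$B'^i_j = \sum_{kl}\frac{\partial x'_i}{\partial x_k}\frac{\partial x_l}{\partial x'_j}B^k_l, \tag{3.13}$$

$$C'_{ij} = \sum_{kl}\frac{\partial x_k}{\partial x'_i}\frac{\partial x_l}{\partial x'_j}C_{kl}.$$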


Clearly, the rank goes as the number of partial derivatives (or direction cosines)
in the definition: zero for a scalar, one for a vector, two for a second-rank
tensor, and so on. Each index (subscript or superscript) ranges over the number
of dimensions of the space. The number of indices (rank of tensor) is independent
of the dimensions of the space. We see that $A^{kl}$ is contravariant with
respect to both indices, $C_{kl}$ is covariant with respect to both indices, and $B^k_l$
transforms contravariantly with respect to the first index $k$ but covariantly with
respect to the second index $l$. Once again, if we are using cartesian coordinates,
all three forms of the tensors of second rank, contravariant, mixed, and covariant,
are the same.

As with the components of a vector, the transformation laws for the com- 
ponents of a tensor, Eq. 3.13, yield entities (and properties) that are indepen- 
dent of the choice of reference frame. This is what makes tensor analysis im- 
portant in physics. The independence of reference frame (invariance) is ideal 
for expressing and investigating universal physical laws. 


2 This means that the coordinates $(x, y, z)$ should be written $(x^1, x^2, x^3)$ since
r transforms as a contravariant vector. Because we shall shortly restrict our
attention to cartesian tensors (where the distinction between contravariance
and covariance disappears) we continue to use subscripts on the coordinates.
This avoids the ambiguity of $x^2$ representing both $x$ squared and $y$.
The second-rank tensor A (components $A^{kl}$) may be conveniently represented
by writing out its components in a square array (3 × 3 if we are in
three-dimensional space),

$$\mathsf{A} = \begin{pmatrix} A^{11} & A^{12} & A^{13} \\ A^{21} & A^{22} & A^{23} \\ A^{31} & A^{32} & A^{33} \end{pmatrix}. \tag{3.14}$$

This does not mean that any square array of numbers or functions forms a 
tensor. The essential condition is that the components transform according to 
Eq. 3.13. 
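To make the transformation requirement concrete, here is a minimal sketch restricted to cartesian coordinates, where Eq. 3.13 reduces to $T'_{ij} = a_{ik}a_{jl}T_{kl}$ with direction cosines $a_{ij}$. It checks that a linear relation of the form $J_i = \sigma_{ik}E_k$ holds in the rotated axes when the array is transformed this way. The numerical values, the rotation angle, and the helper names are assumptions made for the illustration.

```python
import math

def rotation_about_z(angle):
    """Direction cosines a_ij for axes rotated by `angle` about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def transform_vector(a, v):
    """v'_i = a_ij v_j."""
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]

def transform_tensor(a, T):
    """T'_ij = a_ik a_jl T_kl (cartesian form of Eq. 3.13)."""
    return [[sum(a[i][k] * a[j][l] * T[k][l]
                 for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

def apply_tensor(T, v):
    """(T v)_i = T_ik v_k."""
    return [sum(T[i][k] * v[k] for k in range(3)) for i in range(3)]

sigma = [[2.0, 0.5, 0.0], [0.5, 1.0, 0.0], [0.0, 0.0, 3.0]]  # illustrative
E = [1.0, -2.0, 0.5]
a = rotation_about_z(0.3)

# Route 1: compute J in the old axes, then rotate the vector.
J_rotated = transform_vector(a, apply_tensor(sigma, E))
# Route 2: rotate sigma and E first, then apply the relation in the new axes.
J_primed = apply_tensor(transform_tensor(a, sigma), transform_vector(a, E))

max_mismatch = max(abs(p - q) for p, q in zip(J_rotated, J_primed))
print(max_mismatch)
```

The two routes agree to rounding error. An arbitrary array of nine numbers that was not transformed by this rule would not, in general, reproduce the relation in the new axes.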

In the context of matrix analysis the preceding transformation equations 
become (for cartesian coordinates) an orthogonal similarity transformation, 
Section 4.3. A geometrical interpretation of a second-rank tensor (the inertia 
tensor) is developed in Section 4.6. 


Addition and Subtraction of Tensors 

The addition and subtraction of tensors is defined in terms of the individual 
elements just as for vectors. To add or subtract two tensors, we add or subtract 
the corresponding elements. If 

$$\mathsf{A} + \mathsf{B} = \mathsf{C}, \tag{3.15}$$

then

$$A^{ij} + B^{ij} = C^{ij}.$$


Of course, A and B must be tensors of the same rank and both expressed in a 
space of the same number of dimensions. 


Summation Convention 

In tensor analysis it is customary to adopt a summation convention to put 
Eq. 3.13 and subsequent tensor equations in a more compact (and, for the be- 
ginning student, a more obscure) form. As long as we are distinguishing between 
contravariance and covariance, let us agree that when an index appears on one 
side of an equation, once as a superscript and once as a subscript, we auto- 
matically sum over that index. Then we may write the second expression in 
Eq. 3.13 as 


$$B'^i_j = \frac{\partial x'_i}{\partial x_k}\frac{\partial x_l}{\partial x'_j}B^k_l, \tag{3.16}$$

with the summation of the right-hand side over $k$ and $l$ implied. This is the
summation convention.³

To illustrate the use of the summation convention and some of the techniques
of tensor analysis, let us show that the now familiar Kronecker delta, $\delta_{kl}$, is
really a mixed tensor of second rank, $\delta^k_l$.⁴ The question is, does $\delta^k_l$ transform


3 In this context $\partial x'_i/\partial x_k$ might better be written as $a^i_k$ and $\partial x_l/\partial x'_j$ as $b^l_j$.

4 It is common practice to refer to a tensor A by specifying a typical component,
$A_{ij}$. As long as the reader refrains from writing nonsense such as A = $A_{ij}$, no
harm is done.
according to Eq. 3.13? This is our criterion for calling it a tensor. We have,
using the summation convention,

$$\delta^k_l\frac{\partial x'_i}{\partial x_k}\frac{\partial x_l}{\partial x'_j} = \frac{\partial x'_i}{\partial x_k}\frac{\partial x_k}{\partial x'_j} \tag{3.17}$$

by definition of the Kronecker delta. Now

$$\frac{\partial x'_i}{\partial x_k}\frac{\partial x_k}{\partial x'_j} = \frac{\partial x'_i}{\partial x'_j} \tag{3.18}$$

by direct partial differentiation of the right-hand side (chain rule). However,
$x'_i$ and $x'_j$ are independent coordinates, and therefore the variation of one with
respect to the other must be zero if they are different, unity if they coincide;
that is,

$$\frac{\partial x'_i}{\partial x'_j} = \delta'^i_j. \tag{3.19}$$

Hence

$$\delta'^i_j = \frac{\partial x'_i}{\partial x_k}\frac{\partial x_l}{\partial x'_j}\delta^k_l,$$

showing that the $\delta^k_l$ are indeed the components of a mixed second-rank tensor.
Notice that this result is independent of the number of dimensions of our space.

The Kronecker delta has one further interesting property. It has the same 
components in all of our rotated coordinate systems and is therefore called 
isotropic. In Section 3.4 we shall meet a third-rank isotropic tensor and three 
fourth-rank isotropic tensors. No isotropic first-rank tensor (vector) exists. 


Symmetry — Antisymmetry 

The order in which the indices appear in our description of a tensor is important.
In general, $A^{mn}$ is independent of $A^{nm}$, but there are some cases of
special interest. If

$$A^{mn} = A^{nm}, \tag{3.20}$$

we call the tensor symmetric. If, on the other hand,

$$A^{mn} = -A^{nm}, \tag{3.21}$$

the tensor is antisymmetric. Clearly, every (second-rank) tensor can be resolved
into symmetric and antisymmetric parts by the identity

$$A^{mn} = \tfrac{1}{2}(A^{mn} + A^{nm}) + \tfrac{1}{2}(A^{mn} - A^{nm}), \tag{3.22}$$

the first term on the right being a symmetric tensor, the second, an antisymmetric
tensor. This resolution into symmetric and antisymmetric tensors will
reappear in the theory of elasticity (Section 3.6). A similar resolution of functions
into symmetric and antisymmetric parts is of extreme importance to
quantum mechanics.
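The identity of Eq. 3.22 is easy to exercise on a concrete array. The sketch below is illustrative only: the sample components are arbitrary numbers and the function names are hypothetical.

```python
def symmetric_part(A):
    """S_mn = (A_mn + A_nm) / 2."""
    n = len(A)
    return [[0.5 * (A[i][j] + A[j][i]) for j in range(n)] for i in range(n)]

def antisymmetric_part(A):
    """N_mn = (A_mn - A_nm) / 2."""
    n = len(A)
    return [[0.5 * (A[i][j] - A[j][i]) for j in range(n)] for i in range(n)]

A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 10.0]]               # arbitrary sample components

S = symmetric_part(A)
N = antisymmetric_part(A)

# Reassemble A from its two parts (Eq. 3.22).
recombined = [[S[i][j] + N[i][j] for j in range(3)] for i in range(3)]
print(S, N, recombined == A)
```

In three dimensions the symmetric part carries six independent components and the antisymmetric part three, which together account for the nine components of the original array.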
Spinors

It was once thought that the system of scalars, vectors, tensors (second-rank),
and so on formed a complete mathematical system, one that is adequate
for describing a physics independent of the choice of reference frame. But the
universe (and mathematical physics) is not this simple. In the realm of elementary
particles, for example, spin zero particles⁵ ($\pi$ mesons, $\alpha$ particles) may be
described with scalars, spin 1 particles (deuterons) by vectors, and spin 2
particles (hypothetical gravitons) by tensors. This listing omits the most common
particles: electrons, protons, and neutrons, all with spin $\frac{1}{2}$. These particles
are properly described by spinors. A spinor is not a scalar, vector, or tensor. A
brief introduction to spinors in the context of group theory appears in Section
4.10.


EXERCISES 

3.1.1 Show that if the components of any tensor of any rank vanish in one particular
coordinate system they vanish in all coordinate systems. 

Note. This point takes on especial importance in the four-dimensional curved space 
of general relativity. If a quantity, expressed as a tensor, exists in one coordinate 
system, it exists in all coordinate systems and is not just a consequence of a choice 
of a coordinate system (as are centrifugal and Coriolis forces in Newtonian 
mechanics). 

3.1.2 The components of tensor A are equal to the corresponding components of tensor
B in one particular coordinate system; that is,

$$A^0_{ij} = B^0_{ij}.$$

Show that tensor A is equal to tensor B, $A_{ij} = B_{ij}$, in all coordinate systems.

3.1.3 The first three components of a four-dimensional vector vanish in each of two
reference frames. If the second reference frame is not merely a rotation of the first
about the $x_4$ axis, that is, if at least one of the coefficients $a_{i4}$ $(i = 1, 2, 3) \neq 0$, show
that the fourth component vanishes in all reference frames. Translated into relativistic
mechanics this means that if momentum is conserved in two Lorentz frames,
then energy is conserved in all Lorentz frames.

3.1.4 From an analysis of the behavior of a general second-rank tensor under 90° and
180° rotations about the coordinate axes, show that an isotropic second-rank
tensor in three-dimensional space must be a multiple of $\delta_{ij}$.

3.1.5 The four-dimensional fourth-rank Riemann-Christoffel curvature tensor of
general relativity, $R_{iklm}$, satisfies the symmetry relations

$$R_{iklm} = -R_{ikml} = -R_{kilm}.$$

With the indices running from 1 to 4, show that the number of independent components
is reduced from 256 to 36 and that the condition

$$R_{iklm} = R_{lmik}$$


5 The particle spin is intrinsic angular momentum (in units of $\hbar$). It is distinct
from classical, orbital angular momentum due to motion.

further reduces the number of independent components to 21. Finally, if the components
satisfy an identity $R_{iklm} + R_{ilmk} + R_{imkl} = 0$, show that the number of
independent components is reduced to 20.

Note. The final three-term identity furnishes new information only if all four
indices are different. Then it reduces the number of independent components by
one third.

3.1.6 $T_{iklm}$ is antisymmetric with respect to all pairs of indices. How many independent
components has it (three-dimensional space)? 


3.2 CONTRACTION, DIRECT PRODUCT 


Contraction 

When dealing with vectors, we formed a scalar product (Section 1.3) by 
summing products of corresponding components : 

$$\mathbf{A}\cdot\mathbf{B} = A_iB_i \quad \text{(summation convention)}. \tag{3.23}$$


The generalization of this expression in tensor analysis is a process known as
contraction. Two indices, one covariant and the other contravariant, are set
equal to each other and then (as implied by the summation convention) we sum
over this repeated index. For example, let us contract the second-rank mixed
tensor $B'^i_j$.

$$B'^i_j \to B'^i_i = \frac{\partial x'_i}{\partial x_k}\frac{\partial x_l}{\partial x'_i}B^k_l = \frac{\partial x_l}{\partial x_k}B^k_l \tag{3.24}$$

by Eq. 3.18 and then by Eq. 3.19

$$B'^i_i = \delta^l_kB^k_l = B^k_k. \tag{3.25}$$


Our contracted second-rank mixed tensor is invariant and therefore a scalar.¹
This is exactly what we obtained in Section 1.3 for the dot product of two vectors
and Section 1.7 for the divergence of a vector. In general, the operation of contraction
reduces the rank of a tensor by 2. An example of the use of contraction
appears in Section 3.6.
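The invariance of the contracted tensor can be illustrated numerically. This sketch assumes cartesian coordinates (so the mixed tensor may be written with two subscripts and the contraction is the trace) and a rotation about the z-axis; the sample components, the angle, and the helper names are illustrative assumptions.

```python
import math

def rotation_about_z(angle):
    """Direction cosines a_ij for axes rotated by `angle` about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def transform_tensor(a, T):
    """T'_ij = a_ik a_jl T_kl for a cartesian second-rank tensor."""
    return [[sum(a[i][k] * a[j][l] * T[k][l]
                 for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

def contract(T):
    """Set the two indices equal and sum: T_ii."""
    return sum(T[i][i] for i in range(3))

B = [[1.0, 2.0, -1.0],
     [0.5, 3.0, 4.0],
     [2.0, -2.0, 6.0]]               # arbitrary sample components

B_primed = transform_tensor(rotation_about_z(0.7), B)
trace_before = contract(B)
trace_after = contract(B_primed)
print(trace_before, trace_after)
```

The individual components change under the rotation, but the contracted quantity does not, as expected of a scalar.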


Direct Product 

The components of a covariant vector (first-rank tensor) $a_i$ and those of a
contravariant vector (first-rank tensor) $b^j$ may be multiplied component by
component to give the general term $a_ib^j$. This, by Eq. 3.13, is actually a second-rank
tensor, for

$$a'_ib'^j = \frac{\partial x_k}{\partial x'_i}a_k\frac{\partial x'_j}{\partial x_l}b^l = \frac{\partial x_k}{\partial x'_i}\frac{\partial x'_j}{\partial x_l}(a_kb^l). \tag{3.26}$$

Contracting, we obtain

$$a'_ib'^i = a_kb^k, \tag{3.27}$$

as in Eqs. 3.24 and 3.25 to give the regular scalar product.

1 In matrix analysis this scalar is the trace of the matrix, Section 4.2.
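Equations 3.26 and 3.27 can be illustrated with a small numerical sketch. It is restricted to cartesian coordinates and a rotation about the z-axis, so covariant and contravariant components coincide; the vectors, the angle, and the helper names are assumptions for the illustration.

```python
import math

def rotation_about_z(angle):
    """Direction cosines a_ij for axes rotated by `angle` about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def transform_vector(a, v):
    """v'_i = a_ij v_j."""
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]

def transform_tensor(a, T):
    """T'_ij = a_ik a_jl T_kl."""
    return [[sum(a[i][k] * a[j][l] * T[k][l]
                 for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

def direct_product(u, v):
    """C_ij = u_i v_j."""
    return [[u[i] * v[j] for j in range(3)] for i in range(3)]

u = [1.0, 2.0, 3.0]
v = [-1.0, 0.5, 2.0]
rot = rotation_about_z(1.1)

# Transform the product as a second-rank tensor ...
C_by_tensor_rule = transform_tensor(rot, direct_product(u, v))
# ... or transform the two vectors first and then multiply (Eq. 3.26).
C_by_vectors = direct_product(transform_vector(rot, u), transform_vector(rot, v))

max_diff = max(abs(C_by_tensor_rule[i][j] - C_by_vectors[i][j])
               for i in range(3) for j in range(3))

# Contraction of the direct product gives the scalar product (Eq. 3.27).
contraction = sum(C_by_tensor_rule[i][i] for i in range(3))
dot = sum(u[i] * v[i] for i in range(3))
print(max_diff, contraction, dot)
```

The two ways of forming the primed components agree, and contracting the primed product returns the unprimed scalar product.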

The operation of adjoining two vectors $a_i$ and $b^j$ as in the last paragraph is
known as forming the direct product. For the case of two vectors, the direct
product is a tensor of second rank. In this sense we may attach meaning to $\nabla\mathbf{E}$,
which was not defined within the framework of vector analysis. In general, the
direct product of two tensors is a tensor of rank equal to the sum of the two
initial ranks; that is,

$$A^i_jB^{kl} = C^{ikl}_j, \tag{3.28a}$$

where $C^{ikl}_j$ is a tensor of fourth rank. From Eqs. 3.13

$$C'^{ikl}_j = \frac{\partial x'_i}{\partial x_m}\frac{\partial x_n}{\partial x'_j}\frac{\partial x'_k}{\partial x_p}\frac{\partial x'_l}{\partial x_q}C^{mpq}_n. \tag{3.28b}$$

The direct product appears in mathematical physics as a technique for
creating new higher-rank tensors. Exercise 3.2.1 is a form of the direct product
in which the first factor is $\nabla$. Applications appear in Section 3.7.

When T is an nth rank cartesian tensor, $(\partial/\partial x_i)T_{jkl\ldots}$, an element of $\nabla\mathsf{T}$, is a
cartesian tensor of rank $n + 1$ (Exercise 3.2.1). However, $(\partial/\partial x_i)T_{jkl\ldots}$ is not a
tensor under more general transformations. In noncartesian systems $\partial/\partial x'_i$ will
act on the partial derivatives $\partial x_p/\partial x'_q$ and destroy the simple tensor transformation
relation.

So far the distinction between a covariant transformation and a contravariant
transformation has been maintained because it does exist in noncartesian
space and because it is of great importance in general relativity. In
Sections 3.8 and 3.9 we shall develop differential relations for noncartesian
tensors. Now, however, because of the simplification achieved, we restrict ourselves
to cartesian tensors. As noted in Section 3.1, the distinction between
contravariance and covariance disappears and all indices are from now on
shown as subscripts. We restate the summation convention and the operation
of contraction.


Summation Convention 

When a subscript (letter, not number) appears twice on one side of an equa- 
tion, summation with respect to that subscript is implied. 

Contraction 

Contraction consists of setting two unlike indices (subscripts) equal to each 
other and then summing as implied by the summation convention. 
EXERCISES


3.2.1 If $T_{\ldots i}$ is a tensor of rank $n$, show that $\partial T_{\ldots i}/\partial x_j$ is a tensor of rank $n + 1$ (cartesian
coordinates).

Note. In noncartesian coordinate systems the coefficients $a_{ij}$ are, in general, functions
of the coordinates, and the simple derivative of a tensor of rank $n$ is not a
tensor except in the special case of $n = 0$. In this case the derivative does yield a
covariant vector (tensor of rank 1) by Eq. 3.11.

3.2.2 If $T_{ijk\ldots}$ is a tensor of rank $n$, show that $\sum_j \partial T_{ijk\ldots}/\partial x_j$ is a tensor of rank $n - 1$
(cartesian coordinates).

3.2.3 The operator

$$\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}$$

may be written as

$$\sum_{i=1}^{4}\frac{\partial^2}{\partial x_i^2},$$

using $x_4 = ict$. This is the four-dimensional Laplacian, usually called the d'Alembertian
and denoted by $\Box^2$. Show that it is a scalar operator.


3.3 QUOTIENT RULE

If $A_i$ and $B_j$ are vectors, as seen in Section 3.2, we can easily show that $A_iB_j$
is a second-rank tensor. Here we are concerned with a variety of inverse relations.
Consider such equations as

$$K_iA_i = B \tag{3.29a}$$

$$K_{ij}A_j = B_i \tag{3.29b}$$

$$K_{ij}A_{jk} = B_{ik} \tag{3.29c}$$

$$K_{ijkl}A_{ij} = B_{kl} \tag{3.29d}$$

$$K_{ij}A_k = B_{ijk} \tag{3.29e}$$


In each of these expressions A and B are known tensors of rank indicated by the
number of indices and A is arbitrary. In each case K is an unknown quantity. We
wish to establish the transformation properties of K. The quotient rule asserts
that if the equation of interest holds in all (rotated) cartesian coordinate systems,
K is a tensor of the indicated rank. The importance in physical theory is
that the quotient rule can establish the tensor nature of quantities. Exercise
3.3.1 is a simple illustration of this. The quotient rule (Eq. 3.29b) shows that the
inertia matrix appearing in the angular momentum equation $\mathbf{L} = \mathsf{I}\boldsymbol{\omega}$, Section
4.6, is a tensor. And Eq. 3.29d is quoted in Section 3.6 to establish the tensor
nature of the generalized Hooke's law "constant" $c_{ijkl}$.

In proving the quotient rule, we consider Eq. 3.29b as a typical case. In our
primed coordinate system

$$K'_{ij}A'_j = B'_i = a_{ik}B_k, \tag{3.30}$$

using the vector transformation properties of B. Since the equation holds in all
rotated cartesian coordinate systems,

$$a_{ik}B_k = a_{ik}(K_{kl}A_l). \tag{3.31}$$

Now, transforming A back into the primed coordinate system¹ (compare Eq.
3.9), we have

$$K'_{ij}A'_j = a_{ik}K_{kl}a_{jl}A'_j. \tag{3.32}$$

Rearranging, we obtain

$$(K'_{ij} - a_{ik}a_{jl}K_{kl})A'_j = 0. \tag{3.33}$$

This must hold for each value of the index $i$ and for every primed coordinate
system. Since the $A'_j$ are arbitrary,² we conclude

$$K'_{ij} = a_{ik}a_{jl}K_{kl}, \tag{3.34}$$

which is our definition of second-rank tensor.

The other equations may be treated similarly, giving rise to other forms of 
the quotient rule. One minor pitfall should be noted— the quotient rule does 
not necessarily apply if B is zero. The transformation properties of zero are 
indeterminate. 


EXERCISES 

3.3.1 The double summation $K_{ij}A_iB_j$ is invariant for any two vectors $A_i$ and $B_j$. Prove
that $K_{ij}$ is a second-rank tensor.

Note. In the form $ds^2$ (invariant) $= g_{ij}\,dx_i\,dx_j$, this result shows that $g_{ij}$, the "metric,"
is a tensor.

3.3.2 The equation $K_{ij}A_{jk} = B_{ik}$ holds for all orientations of the coordinate system. If
A and B are second-rank tensors, show that K is a second-rank tensor also.

3.3.3 The exponential in a plane wave is $\exp[i(\mathbf{k}\cdot\mathbf{r} - \omega t)]$. We recognize $x_\mu = (x_1, x_2, x_3, ict)$
as a prototype vector in Minkowski space. If $\mathbf{k}\cdot\mathbf{r} - \omega t$ is a scalar under
Lorentz transformations (Section 3.7), show that $k_\mu = (k_1, k_2, k_3, i\omega/c)$ is a vector
in Minkowski space.

Note. Multiplication by $\hbar$ yields $(\mathbf{p}, iE/c)$ as a vector in Minkowski space.


1 Note carefully the order of the indices of the direction cosine $a_{jl}$ in this
inverse transformation. We have

$$A_l = \sum_j \frac{\partial x_l}{\partial x'_j}A'_j = \sum_j a_{jl}A'_j.$$

2 We might, for instance, take $A'_1 = 1$ and $A'_m = 0$ for $m \neq 1$. Then the equation
$K'_{i1} = a_{ik}a_{1l}K_{kl}$ follows immediately. The rest of Eq. 3.34 comes from
other special choices of the arbitrary $A'_j$.
3.4 PSEUDOTENSORS, DUAL TENSORS

So far our coordinate transformations have been restricted to pure rotations.
We now consider the effect of reflections or inversions. If we have transformation
coefficients $a_{ij} = -\delta_{ij}$, then by Eq. 3.7

$$x_i = -x'_i, \tag{3.35}$$

which is an inversion. Note carefully that this transformation changes our
initial right-handed coordinate system into a left-handed coordinate system.¹
Our prototype vector $\mathbf{r}$ with components $(x_1, x_2, x_3)$ transforms to $\mathbf{r}' =
(x'_1, x'_2, x'_3) = (-x_1, -x_2, -x_3)$. This new vector $\mathbf{r}'$ has negative components,
relative to the new transformed set of axes. As shown in Fig. 3.1, reversing the
directions of the coordinate axes and changing the signs of the components
gives $\mathbf{r}' = \mathbf{r}$. The vector (an arrow in space) stays exactly as it was before the
transformation was carried out. The position vector $\mathbf{r}$ and all other vectors
whose components behave this way (reversing sign with a reversal of the coordinate
axes) are called polar vectors.


FIG. 3.1 Inversion of cartesian coordinates: polar vector

A fundamental difference appears when we encounter a vector defined as the
cross product of two polar vectors. Let $\mathbf{C} = \mathbf{A} \times \mathbf{B}$, where both $\mathbf{A}$ and $\mathbf{B}$ are
polar vectors. From Eq. 1.33 of Section 1.4 the components of $\mathbf{C}$ are given by

$$C_1 = A_2B_3 - A_3B_2, \tag{3.36}$$

and so on. Now when the coordinate axes are inverted, $A_i \to -A'_i$, $B_j \to -B'_j$,
but from its definition $C_k \to +C'_k$; that is, our cross-product vector, vector $\mathbf{C}$,
does not behave like a polar vector under inversion. To distinguish, we label it a
pseudovector or axial vector (see Fig. 3.2). The term axial vector is frequently
used because these cross products often arise from a description of rotation.
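The sign behavior just described can be verified directly. The sketch below is illustrative: the components of A and B are arbitrary, and the helper names are hypothetical. It also anticipates the determinant factor that appears in the pseudovector transformation law later in this section.

```python
def cross(A, B):
    """Components of A x B from the usual component formula (Eq. 3.36 etc.)."""
    return [A[1] * B[2] - A[2] * B[1],
            A[2] * B[0] - A[0] * B[2],
            A[0] * B[1] - A[1] * B[0]]

def transform_vector(a, v):
    """Polar-vector rule: v'_i = a_ij v_j."""
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]

def det3(a):
    """Determinant of a 3x3 nested list."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

inversion = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]   # a_ij = -delta_ij

A = [1.0, 2.0, 3.0]                  # arbitrary polar vectors
B = [-2.0, 0.5, 4.0]
C = cross(A, B)

# Components of A and B in the inverted axes (they change sign).
A_primed = transform_vector(inversion, A)
B_primed = transform_vector(inversion, B)

# Cross product formed from the primed components: no sign change.
C_from_primed = cross(A_primed, B_primed)

# What a polar vector would have done, and the pseudovector rule |a| a_ij C_j.
C_polar_rule = transform_vector(inversion, C)
C_pseudo_rule = [det3(inversion) * c for c in C_polar_rule]

print(C, C_from_primed, C_polar_rule, C_pseudo_rule)
```

The components of C computed in the inverted axes equal the original components, whereas the polar-vector rule would have reversed them; including the determinant factor restores agreement.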


1 This is an inversion of the coordinate system or coordinate axes, objects in
the physical world remaining fixed.
FIG. 3.2 Inversion of cartesian coordinates: axial vector

Examples are

angular velocity, $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$,
angular momentum, $\mathbf{L} = \mathbf{r} \times \mathbf{p}$,
torque, $\mathbf{N} = \mathbf{r} \times \mathbf{f}$,
magnetic induction field $\mathbf{B}$, $\dfrac{\partial\mathbf{B}}{\partial t} = -\nabla \times \mathbf{E}$.

In $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{r}$, the axial vector is the angular velocity $\boldsymbol{\omega}$, and $\mathbf{r}$ and $\mathbf{v} = d\mathbf{r}/dt$ are
polar vectors. Clearly, axial vectors occur frequently in elementary physics,
although this fact is usually not pointed out. In a right-handed coordinate system
an axial vector $\mathbf{C}$ has a sense of rotation associated with it given by a right-hand
rule (compare Section 1.4). In the inverted left-handed system the sense
of rotation is a left-handed rotation. This is indicated by the curved arrows in
Fig. 3.2.

The distinction between polar and axial vectors may also be illustrated by a
reflection. A polar vector reflects in a mirror like a real physical arrow, Fig.
3.3a. In Figs. 3.1 and 3.2 the coordinates are inverted; the physical world remains
fixed. Here the coordinate axes remain fixed; the world is reflected, as
in a mirror in the xz-plane. Specifically, in this representation we keep the axes
fixed and associate a change of sign with the component of the vector. For a
mirror in the xz-plane, $P_y \to -P_y$. We have

$$\mathbf{P} = (P_x, P_y, P_z),$$

$$\mathbf{P}' = (P_x, -P_y, P_z), \quad \text{polar vector}.$$

An axial vector such as a magnetic field $\mathbf{H}$ or a magnetic moment $\boldsymbol{\mu}$
(= current × area of current loop) behaves quite differently under reflection.
Consider the magnetic field $\mathbf{H}$ and magnetic moment $\boldsymbol{\mu}$ to be produced by an
electric charge moving in a circular path (Exercise 5.8.4 and Example 12.5.1).
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Reflection reverses the sense of rotation of the charge. The two current loops 
and the resulting magnetic momehts are shown in Fig. 3.3 b. We have 

Jl (/^x> fly) l^z) 

V = (-li X9 fly 9 

If we agree that the universe does not care whether we use a right- or left- 
handed coordinate system, then it does not make sense to add an axial vector to 
a polar vector. In the vector equation A = B, both A and B are either polar 
vectors or axial vectors. 2 Similar restrictions apply to scalars and pseudo- 
scalars and, in general, to the tensors and pseudotensors considered sub- 
sequently. 

Usually, pseudoscalars, pseudovectors, and pseudotensors will transform as

$$S' = |a|\,S,$$
$$C_i' = |a|\,a_{ij} C_j, \tag{3.37}$$
$$A_{ij}' = |a|\,a_{ik} a_{jl} A_{kl},$$

where $|a|$ is the determinant³ of the array of coefficients $a_{mn}$. In our inversion the
determinant is

$$|a| = \begin{vmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{vmatrix} = -1. \tag{3.38}$$

For a reflection of one axis, the x-axis,

$$|a| = \begin{vmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} = -1, \tag{3.39}$$

and again the determinant $|a| = -1$. On the other hand, for all pure rotations
the determinant $|a|$ is always +1. This is discussed further in Section 4.3. Often
quantities that transform according to Eq. 3.37 are known as tensor densities.
They are regular tensors as far as rotations are concerned, differing from tensors
only in reflections or inversions of the coordinates, and then the only difference
is the appearance of an additional minus sign from the determinant $|a|$.

In Chapter 1 the triple scalar product $S = \mathbf{A} \times \mathbf{B} \cdot \mathbf{C}$ was shown to be a scalar
(under rotations). Now by considering the transformation given by Eq. 3.35, we
see that $S \to -S$, proving that the triple scalar product is actually a
pseudoscalar. This behavior was foreshadowed by the geometrical analogy of a volume.
If all three parameters of the volume, length, depth, and height, change from
positive distances to negative distances, the product of the three will be negative.
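
As a numerical illustration of the role of the determinant, the sketch below evaluates $|a|$ for an inversion, a reflection of the x-axis, and a rotation, and compares the triple scalar product built from the transformed components with $|a|\,S$. The function names and the sample vectors are assumptions chosen for the example.

```python
# Sketch: determinant |a| for three coordinate transformations, and the
# triple scalar product S = A x B . C formed from the transformed components.
# All names and sample values are illustrative assumptions.

def det3(a):
    """Determinant of a 3x3 array given as nested tuples."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def transform(a, v):
    """New components v'_i = a_ij v_j of a polar vector."""
    return tuple(sum(a[i][j] * v[j] for j in range(3)) for i in range(3))

def triple(A, B, C):
    """Triple scalar product A x B . C written as a determinant."""
    return det3((A, B, C))

inversion = ((-1, 0, 0), (0, -1, 0), (0, 0, -1))
reflection = ((-1, 0, 0), (0, 1, 0), (0, 0, 1))    # x-axis reversed
rotation = ((0, 1, 0), (-1, 0, 0), (0, 0, 1))      # 90 degrees about the z-axis

A, B, C = (1, 2, 3), (0, 1, 4), (5, 6, 0)
S = triple(A, B, C)

for name, a in (("inversion", inversion),
                ("reflection", reflection),
                ("rotation", rotation)):
    S_new = triple(transform(a, A), transform(a, B), transform(a, C))
    print(name, "|a| =", det3(a), " S' =", S_new, " |a| S =", det3(a) * S)
```

In each case the value recomputed from the new components agrees with $|a|\,S$, so the sign changes only for the two improper transformations.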


2 The big exception to this is in beta decay, weak interactions. Here the 
universe distinguishes between right- and left-handed systems, and we add 
polar and axial vector interactions. 

3 Determinants are described in Section 4.1.
Levi-Civita Symbol

For future use it is convenient to introduce the three-dimensional
Levi-Civita symbol $\varepsilon_{ijk}$ defined by

$$\varepsilon_{123} = \varepsilon_{231} = \varepsilon_{312} = 1,$$
$$\varepsilon_{132} = \varepsilon_{213} = \varepsilon_{321} = -1, \tag{3.40}$$
$$\text{all other } \varepsilon_{ijk} = 0.$$

Note that $\varepsilon_{ijk}$ is totally antisymmetric with respect to all pairs of indices.
Suppose now that we have a third-rank pseudotensor $\delta_{ijk}$, which in one particular
coordinate system is equal to $\varepsilon_{ijk}$. Then

$$\delta_{ijk}' = |a|\,a_{ip} a_{jq} a_{kr}\,\varepsilon_{pqr} \tag{3.41}$$

by definition of pseudotensor. Now

$$a_{1p} a_{2q} a_{3r}\,\varepsilon_{pqr} = |a| \tag{3.42}$$

by direct expansion of the determinant, showing that $\delta_{123}' = |a|^2 = 1 = \varepsilon_{123}$.
Considering the other possibilities one by one, we find

$$\delta_{ijk}' = \varepsilon_{ijk} \tag{3.43}$$

for rotations and reflections. Hence $\varepsilon_{ijk}$ is a pseudotensor.⁴,⁵ Furthermore, it is
seen to be an isotropic pseudotensor with the same components in all rotated
cartesian coordinate systems.
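
The argument of Eqs. 3.41 to 3.43 can be checked by brute force. The sketch below, which uses zero-based indices 0, 1, 2 in place of the text's 1, 2, 3, applies the pseudotensor transformation law to all 27 components for one rotation and one reflection. The closed-form expression used for $\varepsilon_{ijk}$ and all function names are our own choices for the illustration.

```python
# Sketch: apply the pseudotensor law of Eq. 3.41 to the Levi-Civita symbol
# and confirm that every component is reproduced (Eq. 3.43).
# Indices run over 0, 1, 2 here; names are illustrative assumptions.
from itertools import product

def levi_civita(i, j, k):
    """+1 for even permutations of (0, 1, 2), -1 for odd, 0 otherwise."""
    return (i - j) * (j - k) * (k - i) // 2

def det3(a):
    """Determinant of a 3x3 array given as nested tuples."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

def transformed_epsilon(a):
    """|a| a_ip a_jq a_kr eps_pqr for every index triple (i, j, k)."""
    d = det3(a)
    triples = list(product(range(3), repeat=3))
    return {(i, j, k): d * sum(a[i][p] * a[j][q] * a[k][r] * levi_civita(p, q, r)
                               for p, q, r in triples)
            for i, j, k in triples}

epsilon = {ijk: levi_civita(*ijk) for ijk in product(range(3), repeat=3)}

rotation = ((0, 1, 0), (-1, 0, 0), (0, 0, 1))      # 90 degrees about the z-axis
reflection = ((-1, 0, 0), (0, 1, 0), (0, 0, 1))    # x-axis reversed

print(transformed_epsilon(rotation) == epsilon)
print(transformed_epsilon(reflection) == epsilon)
```

Dropping the factor `d` in `transformed_epsilon` would leave the rotation check intact but reverse every sign for the reflection, which is the difference between a tensor and a pseudotensor here.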


Dual Tensors

With any antisymmetric second-rank tensor C (in three-dimensional space)
we may associate a dual pseudovector $C_i$ defined by

$$C_i = \tfrac{1}{2}\,\varepsilon_{ijk} C_{jk}. \tag{3.44}$$

Here the antisymmetric $C_{jk}$ may be written

$$C_{jk} = \begin{pmatrix} 0 & C_{12} & -C_{31} \\ -C_{12} & 0 & C_{23} \\ C_{31} & -C_{23} & 0 \end{pmatrix}. \tag{3.45}$$

We know that $C_i$ must transform as a vector under rotations from the double
contraction of the fifth-rank (pseudo) tensor $\varepsilon_{ijk} C_{mn}$ but that it is really a


4 The usefulness of $\varepsilon_{ijk}$ extends far beyond this section. For instance, the
matrices $M_k$ of Exercise 4.2.16 were derived from $(M_k)_{ij} = -i\varepsilon_{ijk}$. Much of
elementary vector analysis can be written in a very compact form by using
$\varepsilon_{ijk}$ and the identity of Exercise 3.4.4. See Evett, A. A. “Permutation Symbol
Approach to Elementary Vector Analysis.” Am. J. Phys. 34, 503 (1966).

5 The numerical value of $\varepsilon_{ijk}$ is given by the triple scalar product of coordinate
unit vectors:

$$\mathbf{e}_i \cdot \mathbf{e}_j \times \mathbf{e}_k.$$

From this point of view each element of $\varepsilon_{ijk}$ is a pseudoscalar, but the $\varepsilon_{ijk}$
collectively form a third-rank pseudotensor.
pseudovector from the pseudo nature of $\varepsilon_{ijk}$. Specifically, the components of
C are given by

$$(C_1, C_2, C_3) = (C_{23}, C_{31}, C_{12}). \tag{3.46}$$


Notice the cyclic order of the indices that comes from the cyclic order of the
components of $\varepsilon_{ijk}$. This duality, given by Eq. 3.46, means that our three-
dimensional vector product may literally be taken to be either a pseudovector
or an antisymmetric second-rank tensor, depending on how we choose to write
it out.
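
To make the duality concrete, the sketch below builds the antisymmetric array $C_{jk} = A_j B_k - A_k B_j$ from two vectors (our choice of example, with zero-based indices) and contracts it with $\varepsilon_{ijk}$ as in Eq. 3.44. The result should coincide with the ordinary cross product. The function names are assumptions made for the illustration.

```python
# Sketch: the dual of the antisymmetric tensor A_j B_k - A_k B_j is A x B.
# Indices run over 0, 1, 2; names and sample vectors are assumptions.

def levi_civita(i, j, k):
    """+1 for even permutations of (0, 1, 2), -1 for odd, 0 otherwise."""
    return (i - j) * (j - k) * (k - i) // 2

def dual(C):
    """Dual pseudovector C_i = (1/2) eps_ijk C_jk (Eq. 3.44)."""
    return tuple(sum(levi_civita(i, j, k) * C[j][k]
                     for j in range(3) for k in range(3)) / 2
                 for i in range(3))

def cross(a, b):
    """Ordinary cartesian cross product."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

A = (1, 2, 3)
B = (4, 5, 6)
C_jk = [[A[j] * B[k] - A[k] * B[j] for k in range(3)] for j in range(3)]

print(dual(C_jk))     # components (C_23, C_31, C_12) of the array
print(cross(A, B))    # the same three numbers
```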

If we take three (polar) vectors A, B, and C, we may define

$$V_{ijk} = \begin{vmatrix} A_i & B_i & C_i \\ A_j & B_j & C_j \\ A_k & B_k & C_k \end{vmatrix} = A_i B_j C_k - A_i B_k C_j + \cdots. \tag{3.47}$$

By an extension of the analysis of Section 3.1 each term $A_p B_q C_r$ is seen to be a
third-rank tensor, making $V_{ijk}$ a tensor of third rank. From its definition as a
determinant $V_{ijk}$ is totally antisymmetric, reversing sign under the interchange
of any two indices, that is, the interchange of any two rows of the determinant.
The dual quantity is

$$V = \frac{1}{3!}\,\varepsilon_{ijk} V_{ijk}, \tag{3.48}$$

clearly a pseudoscalar. By expansion it is seen that

$$V = \begin{vmatrix} A_1 & B_1 & C_1 \\ A_2 & B_2 & C_2 \\ A_3 & B_3 & C_3 \end{vmatrix}, \tag{3.49}$$

our familiar triple scalar product.
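
The expansion leading to the triple scalar product is short enough to write out. The following is a sketch of one way to organize it, using only the definitions above and the antisymmetry of $\varepsilon_{ijk}$; the intermediate symbol $T$ is introduced here purely for bookkeeping.

```latex
% Full expansion of the determinant of rows i, j, k:
V_{ijk} = A_i B_j C_k - A_i B_k C_j + A_j B_k C_i
        - A_j B_i C_k + A_k B_i C_j - A_k B_j C_i .

% Contract with \varepsilon_{ijk}. In each of the six terms relabel the dummy
% indices so that the factors read A_i B_j C_k; the permutation sign carried
% by the term is cancelled by the same sign from \varepsilon_{ijk}. With
T \equiv \varepsilon_{ijk} A_i B_j C_k
% every term contributes +T, so that
\varepsilon_{ijk} V_{ijk} = 6\,T ,
\qquad
V = \frac{1}{3!}\,\varepsilon_{ijk} V_{ijk}
  = \varepsilon_{ijk} A_i B_j C_k
  = \mathbf{A} \cdot \mathbf{B} \times \mathbf{C} .
```

The factor 1/3! in the definition of the dual therefore does nothing more than remove the sixfold overcounting of the same contraction.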

For use in writing of Maxwell’s equations in covariant form, Section 3.7, we
want to extend this dual vector analysis to four-dimensional space and, in
particular, to indicate that the four-dimensional volume element, $dx_1\,dx_2\,dx_3\,dx_4$,
is a pseudoscalar.

We introduce the Levi-Civita symbol $\varepsilon_{ijkl}$, the four-dimensional analog of
$\varepsilon_{ijk}$. This quantity $\varepsilon_{ijkl}$ is defined as totally antisymmetric in all four indices. If
(ijkl) is an even permutation⁶ of (1, 2, 3, 4), then $\varepsilon_{ijkl}$ is defined as +1; if it is an
odd permutation, then $\varepsilon_{ijkl}$ is −1. The Levi-Civita $\varepsilon_{ijkl}$ may be proved a pseudotensor
of rank 4 by analysis similar to that used for establishing the nature of
$\varepsilon_{ijk}$. Introducing a fourth-rank tensor


6 A permutation is odd if it involves an odd number of interchanges of adjacent
indices such as (1 2 3 4) → (1 3 2 4). Even permutations arise from an even
number of transpositions of adjacent indices. (Actually the word “adjacent”
is not necessary.)
$$H_{ijkl} = \begin{vmatrix} A_i & B_i & C_i & D_i \\ A_j & B_j & C_j & D_j \\ A_k & B_k & C_k & D_k \\ A_l & B_l & C_l & D_l \end{vmatrix}, \tag{3.50}$$

built from the polar vectors A, B, C, and D, we may define the dual quantity

$$H = \frac{1}{4!}\,\varepsilon_{ijkl} H_{ijkl}. \tag{3.51}$$

We actually have a quadruple contraction which reduces the rank to zero. From
the pseudo nature of $\varepsilon_{ijkl}$, H is a pseudoscalar. Now we let A, B, C, and D be
infinitesimal displacements along the four coordinate axes (Minkowski space),

$$\mathbf{A} = (dx_1, 0, 0, 0),$$
$$\mathbf{B} = (0, dx_2, 0, 0), \text{ and so on}, \tag{3.52}$$

and

$$H = dx_1\,dx_2\,dx_3\,dx_4. \tag{3.53}$$

The four-dimensional volume element is now identified as a pseudoscalar. We
use this result in Section 3.7. This result could have been expected from the
results of the special theory of relativity. The Lorentz-Fitzgerald contraction
of $dx_1\,dx_2\,dx_3$ just balances the time dilation of $dx_4$.
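
The contraction of Eq. 3.51 can be carried out explicitly for the displacements of Eq. 3.52. The sketch below assumes finite sample values (2, 3, 5, 7) standing in for the four differentials, uses zero-based indices, and evaluates the determinants by a permutation sum; the helper names are ours.

```python
# Sketch: H = (1/4!) eps_ijkl H_ijkl for four displacements along the axes.
# The numbers 2, 3, 5, 7 stand in for dx_1 ... dx_4; all names are assumptions.
import math
from itertools import permutations

def parity(p):
    """+1 for an even permutation of 0..n-1, -1 for an odd one."""
    inversions = sum(1 for x in range(len(p))
                     for y in range(x + 1, len(p)) if p[x] > p[y])
    return -1 if inversions % 2 else 1

def det(m):
    """Determinant by the permutation (Leibniz) expansion."""
    total = 0
    for p in permutations(range(len(m))):
        term = parity(p)
        for row, col in enumerate(p):
            term *= m[row][col]
        total += term
    return total

def dual_scalar(vectors):
    """Contract the totally antisymmetric H_ijkl with the Levi-Civita symbol."""
    n = len(vectors)
    total = 0
    # eps vanishes unless all indices differ, so only permutations contribute.
    for idx in permutations(range(n)):
        rows = [[v[i] for v in vectors] for i in idx]   # row i is (A_i, B_i, C_i, D_i)
        total += parity(idx) * det(rows)
    return total / math.factorial(n)

A = (2, 0, 0, 0)
B = (0, 3, 0, 0)
C = (0, 0, 5, 0)
D = (0, 0, 0, 7)

print(dual_scalar([A, B, C, D]))   # product of the four displacements
```

Nothing in the two helper functions is specific to four dimensions, which reflects the remark that the component analysis extends to any number of dimensions.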

We slipped into this four-dimensional space as a simple mathematical extension
of the three-dimensional space and, indeed, we could just as easily have
discussed 5-, 6-, or N-dimensional space. This is typical of the power of the
component analysis. Physically, this four-dimensional space is usually taken as
Minkowski space,

$$(x_1, x_2, x_3, x_4) = (x, y, z, ict), \tag{3.54}$$

where t is time. This is the merger of space and time achieved in special rela- 
tivity. The transformations that describe the rotations in four-dimensional 
space are the Lorentz transformations of special relativity. We encounter these 
Lorentz transformations in Sections 3.7 and 4.13. 


Irreducible Tensors 

For some applications, particularly in the quantum theory of angular
momentum, our cartesian tensors are not particularly convenient. In mathematical
language our general second-rank tensor $A_{ij}$ is reducible, which means that it
can be decomposed into parts of lower tensor rank. In fact, we have already
done this. From Eq. 3.25

$$A = A_{ii} \tag{3.55}$$

is a scalar quantity, the trace of $A_{ij}$.


7 An alternate approach, using matrices, is given in Section 4.3 (see Exercise 
4.3.9). 
The antisymmetric portion

$$B_{ij} = \tfrac{1}{2}(A_{ij} - A_{ji}) \tag{3.56}$$

has just been shown to be equivalent to a (pseudo) vector, or

$$B_{ij} = C_k, \quad \text{cyclic permutation of } i, j, k. \tag{3.57}$$

By subtracting the scalar A and the vector $C_k$ from our original tensor, we have
an irreducible, symmetric, zero-trace second-rank tensor, $S_{ij}$, in which

$$S_{ij} = \tfrac{1}{2}(A_{ij} + A_{ji}) - \tfrac{1}{3} A\,\delta_{ij}, \tag{3.58}$$

with five independent components. Then, finally, our original cartesian tensor
may be written

$$A_{ij} = \tfrac{1}{3} A\,\delta_{ij} + C_k + S_{ij}. \tag{3.59}$$

The three quantities A, $C_k$, and $S_{ij}$ form spherical tensors of rank 0, 1, and 2,
respectively, transforming like the spherical harmonics $Y_L^M$ (Chapter 12) for
L = 0, 1, and 2. Further details of such spherical tensors and their uses will be
found in the book by Rose, cited in Chapter 12.
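
A small numerical example may help fix the bookkeeping of this reduction (1 + 3 + 5 = 9 components). The sketch below splits an arbitrary 3 × 3 array, chosen by us for illustration, into its trace, its antisymmetric part, and its symmetric zero-trace part, and then reassembles it. The function name `decompose` is an assumption.

```python
# Sketch: reduce a general second-rank cartesian tensor into a scalar (trace),
# an antisymmetric part (3 components), and a symmetric zero-trace part (5).
# The sample array and the function name are illustrative assumptions.

def decompose(A):
    """Return (trace, antisymmetric part, symmetric zero-trace part)."""
    trace = sum(A[i][i] for i in range(3))
    antisym = [[(A[i][j] - A[j][i]) / 2 for j in range(3)] for i in range(3)]
    sym0 = [[(A[i][j] + A[j][i]) / 2 - (trace / 3 if i == j else 0.0)
             for j in range(3)] for i in range(3)]
    return trace, antisym, sym0

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 3]]

trace, B, S = decompose(A)

# Dual (pseudo)vector of the antisymmetric part, in the cyclic order of Eq. 3.46.
C = (B[1][2], B[2][0], B[0][1])

# Reassemble: (1/3) A delta_ij + B_ij + S_ij should give back the original array.
rebuilt = [[(trace / 3 if i == j else 0.0) + B[i][j] + S[i][j]
            for j in range(3)] for i in range(3)]

print("trace:", trace)
print("pseudovector:", C)
print("trace of S:", sum(S[i][i] for i in range(3)))
print("rebuilt == A:", rebuilt == A)
```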

A specific example of the preceding reduction is furnished by the symmetric
electric quadrupole tensor

$$Q_{ij} = \int (3 x_i x_j - r^2 \delta_{ij})\,\rho(x_1, x_2, x_3)\,d^3x.$$

The $-r^2 \delta_{ij}$ term represents a subtraction of the scalar trace (the three i = j
terms). The resulting $Q_{ij}$ has zero trace. The strain tensor of Section 3.6 is
another example of this reduction (see Exercise 3.6.7).


EXERCISES 


3.4.1 An antisymmetric square array is given by

$$\begin{pmatrix} 0 & C_3 & -C_2 \\ -C_3 & 0 & C_1 \\ C_2 & -C_1 & 0 \end{pmatrix} = \begin{pmatrix} 0 & C_{12} & C_{13} \\ -C_{12} & 0 & C_{23} \\ -C_{13} & -C_{23} & 0 \end{pmatrix},$$

where $(C_1, C_2, C_3)$ form a pseudovector. Assuming that the relation

$$C_i = \frac{1}{2!}\,\varepsilon_{ijk} C_{jk}$$

holds in all coordinate systems, prove that $C_{jk}$ is a tensor. (This is another form
of the quotient theorem.)

3.4.2 Show that the vector product is unique to three-dimensional space, that is, only 
in three dimensions can we establish a one-to-one correspondence between the 
components of an antisymmetric tensor (second-rank) and the components of a 
vector. 
3.4.3 Show that
(a) $\delta_{ii} = 3$,
(b) $\delta_{ij}\varepsilon_{ijk} = 0$,
(c) $\varepsilon_{ipq}\varepsilon_{jpq} = 2\delta_{ij}$,
(d) $\varepsilon_{ijk}\varepsilon_{ijk} = 6$.

3.4.4 Show that

$$\varepsilon_{ijk}\varepsilon_{pqk} = \delta_{ip}\delta_{jq} - \delta_{iq}\delta_{jp}.$$

3.4.5 (a) Express the components of a cross-product vector C, C = A × B, in terms
of $\varepsilon_{ijk}$ and the components of A and B.
(b) Use the antisymmetry of $\varepsilon_{ijk}$ to show that A · A × B = 0.

ANS. (a) $C_i = \varepsilon_{ijk} A_j B_k$.

3.4.6 (a) Show that the inertia tensor (matrix) of Section 4.6 may be written

$$I_{ij} = m(x_n x_n \delta_{ij} - x_i x_j)$$

for a particle of mass m at $(x_1, x_2, x_3)$.
(b) Show that

$$I_{ij} = -M_{il} M_{lj} = -m\,\varepsilon_{ilk} x_k\,\varepsilon_{ljm} x_m,$$

where $M_{il} = m^{1/2}\varepsilon_{ilk} x_k$. This is the contraction of two second-rank tensors
and is identical with the matrix product of Section 4.2.

3.4.7 Write $\nabla \cdot \nabla \times \mathbf{A}$ and $\nabla \times \nabla\varphi$ in $\varepsilon_{ijk}$ notation, so that it becomes obvious that each
expression vanishes.

ANS. $\nabla \cdot \nabla \times \mathbf{A} = \varepsilon_{ijk}\,\dfrac{\partial}{\partial x_i}\dfrac{\partial}{\partial x_j} A_k$,

$(\nabla \times \nabla\varphi)_i = \varepsilon_{ijk}\,\dfrac{\partial}{\partial x_j}\dfrac{\partial}{\partial x_k}\varphi$.

3.4.8 Expressing cross products in terms of Levi-Civita symbols ($\varepsilon_{ijk}$), derive the
BAC-CAB rule, Eq. 1.50.

Hint. The relation of Exercise 3.4.4 is helpful.

3.4.9 Verify that each of the following fourth-rank tensors is isotropic, that is, it has
the same form independent of any rotation of the coordinate systems.
(a) $A_{ijkl} = \delta_{ij}\delta_{kl}$,
(b) $B_{ijkl} = \delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}$,
(c) $C_{ijkl} = \delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk}$.

3.4.10 Show that the two index Levi-Civita symbol $\varepsilon_{ij}$ is a second-rank pseudotensor (in
two-dimensional space). Does this contradict the uniqueness of $\delta_{ij}$ (Exercise
3.1.4)?

3.4.11 (a) Represent $\varepsilon_{ij}$ by a 2 × 2 matrix, and using the 2 × 2 rotation matrix of
Section 4.3, show that $\varepsilon_{ij}$ is invariant under orthogonal similarity
transformations.
(b) Demonstrate the pseudo nature of $\varepsilon_{ij}$ by using
$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ as the transforming matrix.

3.4.12 Given $A_k = \tfrac{1}{2}\varepsilon_{ijk} B_{ij}$ with $B_{ij} = -B_{ji}$, antisymmetric, show that

$$B_{mn} = \varepsilon_{mnk} A_k.$$

3.4.13 Show that the vector identity

$$(\mathbf{A} \times \mathbf{B}) \cdot (\mathbf{C} \times \mathbf{D}) = (\mathbf{A} \cdot \mathbf{C})(\mathbf{B} \cdot \mathbf{D}) - (\mathbf{A} \cdot \mathbf{D})(\mathbf{B} \cdot \mathbf{C})$$

(Exercise 1.5.12) follows directly from the description of a cross product with
$\varepsilon_{ijk}$ and the identity of Exercise 3.4.4.


3.5 DYADICS 


Occasionally, particularly in the older literature and older textbooks, the 
reader will see references to dyads or dyadics. The dyadic is a somewhat clumsy 
device for extending ordinary vector analysis to cover tensors of second rank. 

If we adjoin two vectors i and j to form the combination ij, we have a dyad. 
Multiplication (scalar or vector) from the left involves the left-hand member of 
the pair and leaves the right-hand member strictly alone : 

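$$\mathbf{A} \cdot \mathbf{ij} = [(\mathbf{i}A_x + \mathbf{j}A_y + \mathbf{k}A_z) \cdot \mathbf{i}]\,\mathbf{j} = A_x\,\mathbf{j}. \tag{3.60}$$

Multiplication from the right is just the reverse; that is,

$$\mathbf{ij} \cdot \mathbf{A} = \mathbf{i}\,[\mathbf{j} \cdot (\mathbf{i}A_x + \mathbf{j}A_y + \mathbf{k}A_z)] = \mathbf{i}\,A_y. \tag{3.61}$$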


From this we see that, in general, the operation of multiplication is non- 
commutative. It must be emphasized strongly that the i and j of the dyad ij are 
not operating on each other. If they had scalar coefficients, these would be 
multiplied together, but as far as the unit vectors are concerned there is no dot 
or cross product involved; they are just sitting there. As just shown, the order
is significant: ij ≠ ji. We thus have a composite quantity that depends in part on
the ordering. This dependence on ordering will reappear when we study matrices 
(Chapter 4) and complex quantities (Chapter 6), the complex number being 
literally an ordered pair of real numbers. 
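
The left and right products with a dyad can be mimicked by storing the nine dyad coefficients in a 3 × 3 array, with entry (p, q) holding the coefficient of the dyad formed from unit vectors p and q. This representation, and the helper names below, are assumptions made for the sketch rather than notation from the text.

```python
# Sketch: a dyadic stored as a 3x3 array of coefficients, with dot products
# taken from the left and from the right. Names are illustrative assumptions.

def dyad(u, v):
    """Coefficient array of the dyad uv: entry (p, q) is u_p v_q."""
    return [[u[p] * v[q] for q in range(3)] for p in range(3)]

def dot_left(a, T):
    """a . T: the vector contracts with the left-hand member of each dyad."""
    return tuple(sum(a[p] * T[p][q] for p in range(3)) for q in range(3))

def dot_right(T, a):
    """T . a: the vector contracts with the right-hand member of each dyad."""
    return tuple(sum(T[p][q] * a[q] for q in range(3)) for p in range(3))

i = (1, 0, 0)
j = (0, 1, 0)
A = (2, 3, 5)            # (A_x, A_y, A_z), arbitrary sample values

ij = dyad(i, j)
print(dot_left(A, ij))   # A_x j, as in Eq. 3.60
print(dot_right(ij, A))  # i A_y, as in Eq. 3.61
```

With these sample components the two products are (0, 2, 0) and (3, 0, 0), so the operation is visibly noncommutative.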

Extending this construction, we adjoin two vectors A and B to form 


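$$\begin{aligned}
\mathsf{T} = \mathbf{AB} &= (\mathbf{i}A_x + \mathbf{j}A_y + \mathbf{k}A_z)(\mathbf{i}B_x + \mathbf{j}B_y + \mathbf{k}B_z) \\
&= \mathbf{ii}\,A_x B_x + \mathbf{ij}\,A_x B_y + \mathbf{ik}\,A_x B_z \\
&\quad + \mathbf{ji}\,A_y B_x + \mathbf{jj}\,A_y B_y + \mathbf{jk}\,A_y B_z \\
&\quad + \mathbf{ki}\,A_z B_x + \mathbf{kj}\,A_z B_y + \mathbf{kk}\,A_z B_z.
\end{aligned} \tag{3.62}$$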

The quantity T = AB is a dyadic formed as shown from a combination of dyads. 
We have proved (Section 3.2) that this product of two vectors AB is a tensor of 
second rank. Hence, dyadics are tensors of second rank, written in a form that 
preserves the vector nature but obscures the tensor transformation properties. 

It has already been noted that the multiplication of a vector and a dyadic is 
not commutative, but there is an important special case in which the operation 
is commutative. We take the dyadic AB and set 

$$\mathbf{a} \cdot \mathbf{AB} = \mathbf{AB} \cdot \mathbf{a}, \tag{3.63}$$

where a is an arbitrary vector. If a = i, then

$$A_x \mathbf{B} = \mathbf{A} B_x,$$
$$\mathbf{i}A_x B_x + \mathbf{j}A_x B_y + \mathbf{k}A_x B_z = \mathbf{i}A_x B_x + \mathbf{j}A_y B_x + \mathbf{k}A_z B_x. \tag{3.64}$$

By equating components, we obtain

$$A_x B_x = A_x B_x,$$
$$A_x B_y = A_y B_x, \tag{3.65}$$
$$A_x B_z = A_z B_x,$$

showing that A = cB, in which c is a constant. In other words, if multiplication
with an arbitrary vector is commutative, the dyadic must be symmetric, and the
coefficient of dyad pq equals the coefficient of dyad qp. Conversely, if the dyadic
is symmetric, multiplication is commutative.

One of the most significant properties of a symmetric dyadic is that it can 
always be put in normal or diagonal form by proper choice of the coordinate 
axes: 

$$\mathsf{T} = \mathbf{ii}\,T_{xx} + \mathbf{jj}\,T_{yy} + \mathbf{kk}\,T_{zz}, \tag{3.66}$$

all the nondiagonal coefficients going to zero. The coordinate transformation 
that puts our dyadic in this diagonal form is known as the principal axis trans- 
formation. It is discussed at some length in Section 4.6. 

There is an interesting and useful geometric interpretation of a symmetric 
dyadic. For simplicity let us assume that our symmetric dyadic T is already in its 
diagonal form. Then with r, the usual distance vector, we form the equation 

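$$\mathbf{r} \cdot \mathsf{T} \cdot \mathbf{r} = 1, \tag{3.67}$$

which limits the length of r according to its orientation. By expanding Eq. 3.67,
we have

$$(\mathbf{i}x + \mathbf{j}y + \mathbf{k}z) \cdot (\mathbf{ii}\,T_{xx} + \mathbf{jj}\,T_{yy} + \mathbf{kk}\,T_{zz}) \cdot (\mathbf{i}x + \mathbf{j}y + \mathbf{k}z) = 1,$$
$$x^2 T_{xx} + y^2 T_{yy} + z^2 T_{zz} = 1. \tag{3.68}$$

If $T_{xx} > 0$, $T_{yy} > 0$, and $T_{zz} > 0$, then Eq. 3.68 defines an ellipsoid with semiaxes
a, b, and c given by

$$a = T_{xx}^{-1/2}, \quad b = T_{yy}^{-1/2}, \quad c = T_{zz}^{-1/2}. \tag{3.69}$$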

For the inertia tensor of Section 4.6 these diagonal elements are clearly positive 
from their definition (Eq. 4.139). Diagonalizing our dyadic corresponded to 
orienting the dyadic ellipsoid so that the ellipsoid axes were lined up with the 
coordinate axes. 

If U is an antisymmetric dyadic,

$$U_{xx} = 0, \text{ and so on},$$
$$U_{xy} = -U_{yx}, \text{ and so on},$$

then for any vector a

$$\mathbf{a} \cdot \mathsf{U} = -\mathsf{U} \cdot \mathbf{a}. \tag{3.70}$$

Multiplication of a vector and an antisymmetric dyadic follows an anticom- 
mutation rule (See Exercise 3.5.4a). 
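
The commutation and anticommutation rules can be seen side by side by splitting an arbitrary coefficient array into its symmetric and antisymmetric parts. The sketch below assumes the same 3 × 3 coefficient representation of a dyadic as in ordinary tensor notation; the sample numbers and helper names are ours.

```python
# Sketch: a vector commutes with a symmetric dyadic and anticommutes with an
# antisymmetric one. The coefficient array and names are assumptions.

def dot_left(a, T):
    """(a . T)_q = a_p T_pq."""
    return tuple(sum(a[p] * T[p][q] for p in range(3)) for q in range(3))

def dot_right(T, a):
    """(T . a)_p = T_pq a_q."""
    return tuple(sum(T[p][q] * a[q] for q in range(3)) for p in range(3))

def split(T):
    """Symmetric and antisymmetric parts of a coefficient array."""
    sym = [[(T[p][q] + T[q][p]) / 2 for q in range(3)] for p in range(3)]
    anti = [[(T[p][q] - T[q][p]) / 2 for q in range(3)] for p in range(3)]
    return sym, anti

T = [[2, 3, -2],
     [-1, 3, 7],
     [2, 1, 5]]
a = (1, 2, 3)

sym, anti = split(T)

print(dot_left(a, sym), dot_right(sym, a))     # equal
print(dot_left(a, anti), dot_right(anti, a))   # equal and opposite
```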

Dyadics are rather awkward to handle in comparison with the usual tensor 
analysis (once the concept of transformation under coordinate rotation has been 
absorbed). They are quite unwieldy for representing third- or higher-rank 
tensors, so we shall return to tensor analysis and have nothing further to do with 
dyadic notation. 


EXERCISES 


3.5.1 If A and B transform as vectors, Eqs. 3.6 and 3.8, show that the dyadic AB 
satisfies the tensor transformation law, Eq. 3.13. 

3.5.2 Show that $\mathsf{I} = \mathbf{ii} + \mathbf{jj} + \mathbf{kk}$ is a unit dyadic in the sense that for any vector V

$$\mathsf{I} \cdot \mathbf{V} = \mathbf{V}.$$

The individual dyads ii, and so on are specific examples of the projection
operators of quantum mechanics.

3.5.3 Show that $\nabla\mathbf{r}$ is equal to the unit dyadic, I.


3.5.4 If U is an antisymmetric dyadic and V a vector, show that

(a) $\mathbf{V} \cdot \mathsf{U} = -\mathsf{U} \cdot \mathbf{V}$,

(b) $\mathbf{V} \cdot \mathsf{U} \cdot \mathbf{V} = 0$.


3.5.5 The two-dimensional vectors $\mathbf{r} = \mathbf{i}x + \mathbf{j}y$ and $\mathbf{t} = -\mathbf{i}y + \mathbf{j}x$ may be related by the
tensor equation $\mathbf{r} \cdot \mathsf{U} = \mathbf{t}$.

(a) Find the tensor U, using our earlier component description of tensors.

(b) Find U and treat it as a dyadic.

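3.5.6 In an investigation of the interaction of molecules a dyadic is formed from the
unit relative distance vectors $\mathbf{e}_{12}$ given by

$$\mathbf{e}_{12} = \frac{\mathbf{r}_2 - \mathbf{r}_1}{|\mathbf{r}_2 - \mathbf{r}_1|}.$$

For

$$\mathsf{U} = \mathsf{I} - 3\,\mathbf{e}_{12}\mathbf{e}_{12}$$

show that trace $\mathsf{U} \cdot \mathsf{U} = 6$.

I is the unit dyadic; that is, $\mathsf{I} = \mathbf{ii} + \mathbf{jj} + \mathbf{kk}$.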


3.5.7 Show that Gauss’s theorem holds for dyadics, that

$$\oint_S d\boldsymbol{\sigma} \cdot \mathsf{D} = \int_V \nabla \cdot \mathsf{D}\,d\tau.$$
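
3.5.8 Show that

$$\oint_S d\boldsymbol{\sigma} \cdot \nabla\mathbf{E} + \oint_S d\boldsymbol{\sigma} \times (\nabla \times \mathbf{E}) - \oint_S d\boldsymbol{\sigma}\,\nabla \cdot \mathbf{E} = 0.$$

The function E is a vector function of position. The integration is over a simple
closed surface. This improbable combination of surface integrals actually appears
in the vector Kirchhoff diffraction theory.

3.5.9 Show that the following zero-trace, symmetric, unit tensors

$$\mathsf{t}^{0} = (2\,\mathbf{kk} - \mathbf{ii} - \mathbf{jj})/\sqrt{6},$$
$$\mathsf{t}^{\pm 1} = \mp\tfrac{1}{2}(\mathbf{ik} + \mathbf{ki}) - \tfrac{i}{2}(\mathbf{jk} + \mathbf{kj}),$$
$$\mathsf{t}^{\pm 2} = \tfrac{1}{2}(\mathbf{ii} - \mathbf{jj}) \pm \tfrac{i}{2}(\mathbf{ij} + \mathbf{ji})$$

satisfy the double contraction relation

$$t^{m\,*}_{ij}\,t^{n}_{ij} = \delta_{mn}.$$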

These unit tensors are used in defining tensor spherical harmonics, an extension 
of the vector spherical harmonics of Section 12.11. The tensor spherical har- 
monics, in turn, are helpful in describing gravitational waves. 

3.5.10 A mass m is being pulled slowly (zero acceleration) up an incline (angle θ from
horizontal). The usual coefficient of friction is μ: f = μN, where N is the normal
force. Define a coordinate system and rewrite this scalar friction equation as a
vector equation replacing the scalar μ by a dyad.

3.5.11 The combination of two vectors AB forms a dyadic but not every dyadic can be
resolved into two vectors. Show that the two-dimensional dyadic 

A = ij - ji 

cannot be resolved into two two-dimensional vectors. 


3.6 THEORY OF ELASTICITY 

When an elastic body is subjected to an external force or stress, it becomes 
deformed or strained. Our study of elasticity in terms of tensors falls naturally 
into three parts: (1) a description of the strain or deformation of the elastic 
substance, (2) a description of the force or stress that produces the deformation, 
and (3) a generalized Hooke’s law in tensor form, relating stress and strain. 

Elastic Strain: Deformation 

The deformation of our elastic body may be described by giving the change 
in relative position of the parts of the body when the body is subjected to some 
external stress (Fig. 3.4). Consider a point $P_0$ at position r relative to some
fixed origin and a second point $Q_0$ displaced from $P_0$ by a distance $\delta\mathbf{x}$. In the
unstrained state the coordinates of $Q_0$ relative to $P_0$ are $\delta x_i$; in the strained
state, when $P_0$ has been displaced a distance u to point $P_1$ and $Q_0$ a distance v
to $Q_1$, the coordinates of $Q_1$ relative to $P_1$ are $\delta y_i = \delta x_i + \delta u_i$. The change in
position of Q relative to P is just $\delta u_i$. Neglecting second- and higher-order
differentials,¹ a three-dimensional Taylor expansion (Eq. 5.109) yields

$$\delta\mathbf{u} = \mathbf{u}(\mathbf{r} + \delta\mathbf{x}) - \mathbf{u}(\mathbf{r}) = (\delta\mathbf{x} \cdot \nabla)\mathbf{u},$$
$$\delta u_i = \frac{\partial u_i}{\partial x_k}\,\delta x_k. \tag{3.71}$$

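Since $u_i$ is the component of a vector, $\partial u_i/\partial x_k$ is an element of a second-rank
tensor, $\nabla\mathbf{u}$. Resolving this tensor into symmetric and antisymmetric terms, we
have

$$\delta u_i = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_k} + \frac{\partial u_k}{\partial x_i}\right)\delta x_k - \frac{1}{2}\left(\frac{\partial u_k}{\partial x_i} - \frac{\partial u_i}{\partial x_k}\right)\delta x_k = \eta_{ik}\,\delta x_k - \xi_{ik}\,\delta x_k. \tag{3.72}$$

The antisymmetric part, $\xi_{ik}$, may be identified as a pure rotation (and not a
deformation at all). From Section 3.4 we may associate an axial vector $\boldsymbol{\xi}$ with

$$\boldsymbol{\xi} = \tfrac{1}{2}\nabla \times \mathbf{u}. \tag{3.73}$$

The displacement $\delta\mathbf{u}$ corresponding to the antisymmetric part $-\xi_{ik}\,\delta x_k$ becomes

$$\delta\mathbf{u} = \tfrac{1}{2}(\nabla \times \mathbf{u}) \times \delta\mathbf{x}. \tag{3.74}$$

This is a rotation about an instantaneous axis through $P_0$ in the direction
$(\nabla \times \mathbf{u})$ through $\tfrac{1}{2}|\nabla \times \mathbf{u}|$ radians. Equation 3.74 is the time integral of
$\mathbf{v}(t) = \boldsymbol{\omega}(t) \times \mathbf{r}$.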
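
The split of Eq. 3.72 and the identification of the rotation in Eq. 3.74 can be verified on a sample displacement gradient. In the sketch below the array `G` holds assumed values of $\partial u_i/\partial x_k$ (to be read in units of, say, $10^{-3}$, so that the strains are small), and all names are ours.

```python
# Sketch: split a displacement gradient G[i][k] = du_i/dx_k into the strain
# eta and the rotation xi of Eq. 3.72, then compare -xi_ik dx_k with
# (1/2)(curl u) x dx of Eq. 3.74. Sample values and names are assumptions.

def split_gradient(G):
    """Return (eta, xi) such that du_i = eta_ik dx_k - xi_ik dx_k."""
    eta = [[(G[i][k] + G[k][i]) / 2 for k in range(3)] for i in range(3)]
    xi = [[(G[k][i] - G[i][k]) / 2 for k in range(3)] for i in range(3)]
    return eta, xi

def curl(G):
    """Components of curl u expressed through the gradient array."""
    return (G[2][1] - G[1][2], G[0][2] - G[2][0], G[1][0] - G[0][1])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def apply(M, v):
    """Contraction M_ik v_k."""
    return tuple(sum(M[i][k] * v[k] for k in range(3)) for i in range(3))

G = [[10, 4, 0],
     [-2, 20, 6],
     [8, 0, -10]]
dx = (1, 2, 3)

eta, xi = split_gradient(G)
rotation_part = tuple(-x for x in apply(xi, dx))
half_curl = tuple(c / 2 for c in curl(G))

print(rotation_part)             # -xi_ik dx_k
print(cross(half_curl, dx))      # (1/2)(curl u) x dx, the same vector
print(tuple(s + r for s, r in zip(apply(eta, dx), rotation_part)))  # equals G dx
```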


1 The limitation to first-order terms is a rather severe limitation implying
relative strains of no more than perhaps 1 percent in actual application. 





The remaining symmetric part of our tensor, $\eta_{ij}$, is taken as a pure strain
tensor. The diagonal elements $(\eta_{11}, \eta_{22}, \eta_{33})$ of $\eta_{ik}$ represent stretches, whereas
the nondiagonal elements represent shear strains.²

This may be seen by considering $Q_0$ to be displaced from $P_0$ along the x-axis;
$\delta\mathbf{x} = \mathbf{i}\,\delta x_1$. From Eq. 3.72

$$\delta u_1 = \eta_{11}\,\delta x_1,$$
$$\delta u_2 = \eta_{21}\,\delta x_1, \tag{3.75}$$
$$\delta u_3 = \eta_{31}\,\delta x_1.$$

Hence the displacement in the strained case is

$$\delta y_1 = \delta x_1 + \delta u_1 = (1 + \eta_{11})\,\delta x_1,$$
$$\delta y_2 = \delta u_2 = \eta_{21}\,\delta x_1, \tag{3.76}$$
$$\delta y_3 = \delta u_3 = \eta_{31}\,\delta x_1.$$

For our initial displacement, $\delta\mathbf{x} = \mathbf{i}\,\delta x_1$, the diagonal term $\eta_{11}$ contributes to
the 1 component of $\delta\mathbf{y}$ (stretching) and $\eta_{21}$ and $\eta_{31}$ contribute to $\delta y_2$ and $\delta y_3$,
respectively, representing shears.

Stress-force 

The stresses or forces must be defined carefully. Referring to Fig. 3.5, which
shows a differential volume, we see that the force in the $x_i$-direction acting on
the surface dA whose normal is in the $x_j$-direction is $P_{ij}\,dA$. The $P_{ij}$'s themselves
are actually “pressures” in the sense of being force/area. Whenever the terms
stress or force are used, it is understood that the $P_{ij}$ are to be multiplied by the
appropriate differential area. These are the forces acting on the small
parallelepiped of Fig. 3.5. For clarity only the forces on the front three faces are shown.
If we assume that the stresses are homogeneous, the forces on the opposite
faces will be reversed in sign as shown in Fig. 3.6. Note that $P_{21}$ is the shear
(in the $x_2$-direction) applied to face B. For a homogeneous force face A must
apply the same shearing stress $P_{21}$ to the outside medium. The stress applied to
face A by the outside medium is just the reverse, or $P_{21}$ directed downward
(in the $-x_2$-direction). Built into this argument are three assumptions that
should be noted explicitly.

1. Homogeneous stress throughout the body. 

2. Existence of static equilibrium. 

3. Absence of body forces (such as gravity acting on 
the mass within the parallelepiped) and body torques 
(external magnetic field acting on magnetic do- 
mains). 

These assumptions permit placement of a further restriction on the $P_{ij}$'s.
Consider the net torque on the parallelepiped shown in Figs. 3.5 and 3.7 about
the $x_3$-axis. The normal pressures $P_{ii}$ exert no net torque. The shearing stresses

2 Clearly, for an ordinary liquid or a gas (which cannot support shear strains), 
the nondiagonal elements must vanish. 
$P_{31}$ and $P_{32}$ have zero moment arm. The shearing stresses $P_{13}$ and $P_{23}$ are
balanced by equal and opposite stresses on the bottom face ($x_3 = 0$). The
remaining torques are

$$P_{21}\,(dx_2\,dx_3)\,dx_1$$

and

$$P_{12}\,(dx_1\,dx_3)\,dx_2, \tag{3.77}$$

which must balance,

$$P_{21}\,dx_1\,dx_2\,dx_3 = P_{12}\,dx_1\,dx_2\,dx_3$$

in the absence of rotation about the $x_3$-axis. We have

$$P_{21} = P_{12} \tag{3.78}$$

and, in general, repeating the argument for the absence of rotation about $x_1$
and $x_2$, we get

$$P_{ij} = P_{ji}. \tag{3.79}$$

Thus the array of stresses (pressure) $P_{ij}$ is symmetric. These are equalities of
magnitude, not direction, which is given by the first index. Now we show that
this array is a tensor. We form an infinitesimal tetrahedron with a slant face of
area dA, normal in the $x_j'$-direction, as shown in Fig. 3.8. The forces on the
slant face are $P_{ij}'\,dA$. The forces on the faces $x_1 = 0$, $x_2 = 0$, and $x_3 = 0$ are,
respectively,
FIG. 3.8 Differential tetrahedron: balance of forces

$$P_{m1}\,(a_{j1}\,dA),$$
$$P_{m2}\,(a_{j2}\,dA),$$
$$P_{m3}\,(a_{j3}\,dA),$$

where $a_{jk}\,dA$ is the area of the face $x_k = 0$, given by the slant area dA projected
onto the plane $x_k = 0$. Our $a_{jk}$ is the usual direction cosine, the cosine of the
angle between the $x_j'$- and the $x_k$-axes.

The force $P_{m1} a_{j1}\,dA$ is along $x_m$. Its component in the $x_i'$-direction is
$a_{im} a_{j1} P_{m1}\,dA$ (no summation). If we now sum over m, the foregoing expression
gives us the sum of the three forces on the back face, $x_1 = 0$, resolved in the
$x_i'$-direction. Finally, summing over all three faces, $x_k = 0$, the
total force along $x_i'$ is

$$a_{im} a_{jk} P_{mk}\,dA = P_{ij}'\,dA \tag{3.80}$$

for static equilibrium. Since the area dA is arbitrary, we have

$$P_{ij}' = a_{im} a_{jk} P_{mk}, \tag{3.81}$$

which by definition makes $P_{mk}$ a tensor.

We note that the strain tensor $\eta_{ij}$ is found to be a tensor by an essentially
mathematical argument, independent of any physics. $P_{ij}$, in contrast, is shown
to be a tensor by physical arguments (equilibrium) that lead directly to the
definition of tensor.
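
The transformation law of Eq. 3.81 is easy to exercise numerically. The sketch below rotates an assumed symmetric stress array about the $x_3$-axis (direction cosines 0.6 and 0.8, chosen so that the rotation matrix is exactly orthogonal on paper) and checks that symmetry and the trace survive. The stress values and names are illustrative assumptions.

```python
# Sketch: transform a stress array with P'_ij = a_im a_jk P_mk (Eq. 3.81)
# for a rotation about the x_3-axis. Values and names are assumptions.

def rotate_tensor(a, P):
    """Second-rank tensor transformation P'_ij = a_im a_jk P_mk."""
    return [[sum(a[i][m] * a[j][k] * P[m][k]
                 for m in range(3) for k in range(3))
             for j in range(3)] for i in range(3)]

# Direction cosines for a rotation about x_3 with cos = 0.6, sin = 0.8.
a = [[0.6, 0.8, 0.0],
     [-0.8, 0.6, 0.0],
     [0.0, 0.0, 1.0]]

# A symmetric stress array in arbitrary units.
P = [[5.0, 2.0, 0.0],
     [2.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]

P_new = rotate_tensor(a, P)

trace_old = sum(P[i][i] for i in range(3))
trace_new = sum(P_new[i][i] for i in range(3))

for row in P_new:
    print(row)
print("traces:", trace_old, trace_new)
```

The trace is the contracted tensor $P_{ii}$, so its invariance is the numerical counterpart of the statement that a contraction lowers the rank by two.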
Stress-strain Relations: Hooke's Law

First, let us assume an isotropic elastic solid. Later we shall return to the
general anisotropic case. Consider a uniform rod parallel to the $x_1$-axis.³ Now
let us investigate the effect on the length of the rod of small tensile stresses $P_{11}$,
$P_{22}$, and $P_{33}$ acting separately. By applying a small tensile stress $P_{11}$, we obtain

$$E\eta_{11} = P_{11}, \tag{3.82a}$$

where E is Young’s modulus. Applying a small tensile stress $P_{22}$, we expect a
contraction along the $x_1$-axis and so write

$$E\eta_{11} = -\sigma P_{22}, \tag{3.82b}$$

the minus sign indicating contraction; σ is Poisson’s ratio. A similar equation
would hold for $P_{33}$. The effects of $P_{11}$, $P_{22}$, and $P_{33}$ together become

$$E\eta_{11} = P_{11} - \sigma P_{22} - \sigma P_{33}. \tag{3.83}$$

All through here we are limiting ourselves to small stresses and small strains so
that the stress-strain relation will be linear. Equation 3.83 may be rewritten

$$E\eta_{11} = (1 + \sigma)P_{11} - \sigma(P_{11} + P_{22} + P_{33}), \tag{3.84}$$

and similarly for $E\eta_{22}$ and $E\eta_{33}$.

Now $\eta_{ij}$ and $P_{ij}$ are tensors as proved earlier in this section. Because of the
symmetry of our system their nondiagonal components are zero. To find the
generalization of Eq. 3.84 in an arbitrarily oriented cartesian coordinate
system, we rotate the axes

$$\eta_{ij}' = a_{ik} a_{jk}\,\eta_{kk}, \qquad P_{ij}' = a_{ik} a_{jk}\,P_{kk}. \tag{3.85}$$

If we multiply Eq. 3.84 by $a_{i1} a_{j1}$, the corresponding equation for $E\eta_{22}$ by
$a_{i2} a_{j2}$, and the equation for $E\eta_{33}$ by $a_{i3} a_{j3}$ and add all three equations, we
obtain

$$E a_{ik} a_{jk}\,\eta_{kk} = (1 + \sigma)\,a_{ik} a_{jk} P_{kk} - \sigma(P_{nn})\,a_{ik} a_{jk}. \tag{3.86}$$

Using Eqs. 3.85 and 3.18, we get

$$E\eta_{ij}' = (1 + \sigma)P_{ij}' - \sigma(P_{mm})\,\delta_{ij}, \tag{3.87}$$

where

$$(P_{mm}) = (P_{nn}') = P_{11} + P_{22} + P_{33}, \tag{3.88}$$

the contracted (and therefore invariant) tensor $P_{ij}$.

It is frequently more convenient to solve for the stresses $P_{ik}$. We may do
this by setting i = j in Eq. 3.87 and contracting


3 This special choice will start us off in a system identified in Chapter 4 as 
the principal axis system, the particular coordinate system in which the 
shearing strains vanish. 
$$E\eta_{jj} = (1 + \sigma)P_{jj} - 3\sigma P_{jj} = (1 - 2\sigma)P_{jj}, \tag{3.89}$$

dropping the primes as superfluous. Substituting back into Eq. 3.87, we have

$$(1 + \sigma)P_{ij} = E\eta_{ij} + \frac{E\sigma}{1 - 2\sigma}\,\eta_{kk}\,\delta_{ij} \tag{3.90}$$

or

$$P_{ij} = \lambda\,\eta_{kk}\,\delta_{ij} + 2\mu\,\eta_{ij}, \tag{3.91}$$

where λ and μ, known as Lamé’s constants, are given by

$$\lambda = \frac{\sigma E}{(1 + \sigma)(1 - 2\sigma)}, \qquad \mu = \frac{E}{2(1 + \sigma)}. \tag{3.92}$$

The constant μ may be identified as the rigidity or shear modulus. Consider a
parallelepiped fixed to the ($x_2 = 0$)-plane with a tangential stress $P_{12}$ applied.
The displacement ($\delta\mathbf{u}$ of Fig. 3.9) is $(\eta x_2, 0, 0)$. In terms of our strain tensor,
Eq. 3.72, $\eta_{ij} = 0$ except for $\eta_{12} = \eta_{21} = \tfrac{1}{2}\eta$. From Eq. 3.91

$$P_{12} = 2\mu \cdot \tfrac{1}{2}\eta = \mu\eta, \tag{3.93}$$

showing that μ is the ratio of shear stress to shear strain η.


FIG. 3.9 Shear stress and shear strain


If the strain is spherically symmetric, as in hydrostatic pressure,

$$\eta_{11} = \eta_{22} = \eta_{33}, \qquad \eta_{12} = \eta_{13} = \eta_{23} = 0. \tag{3.94}$$

Then Eq. 3.91 becomes

$$P_{11} = (3\lambda + 2\mu)\,\eta_{11} = 3k\,\eta_{11}, \tag{3.95}$$

where

$$k = \lambda + \tfrac{2}{3}\mu. \tag{3.96}$$

Since $3\eta_{11}$ is the relative change in volume to first order, we identify k as the
bulk modulus.
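
The relations among E, σ, λ, μ, and k can be tied together in a short calculation. The sketch below assumes round, roughly steel-like values of Young's modulus and Poisson's ratio (our choice, in SI units), computes the Lamé constants from Eq. 3.92 and the bulk modulus from Eq. 3.96, and then feeds the strains produced by a single tensile stress back through Eq. 3.91. The function names are assumptions.

```python
# Sketch: Lame constants and bulk modulus from E and sigma, with a
# consistency check of Eq. 3.91 against the uniaxial relations of Eq. 3.84.
# Material values and function names are illustrative assumptions.

def lame(E, sigma):
    """Lame constants (lambda, mu) of Eq. 3.92."""
    lam = sigma * E / ((1 + sigma) * (1 - 2 * sigma))
    mu = E / (2 * (1 + sigma))
    return lam, mu

def stress_from_strain(eta, lam, mu):
    """Isotropic Hooke's law, P_ij = lambda eta_kk delta_ij + 2 mu eta_ij."""
    trace = sum(eta[k][k] for k in range(3))
    return [[lam * trace * (1 if i == j else 0) + 2 * mu * eta[i][j]
             for j in range(3)] for i in range(3)]

E, sigma = 2.0e11, 0.3          # assumed Young's modulus (Pa) and Poisson's ratio
lam, mu = lame(E, sigma)
k = lam + 2 * mu / 3            # bulk modulus, Eq. 3.96

# Strains caused by a single tensile stress P_11 = P (Eq. 3.84 and its analogs).
P = 1.0e8
eta = [[P / E, 0.0, 0.0],
       [0.0, -sigma * P / E, 0.0],
       [0.0, 0.0, -sigma * P / E]]

stress = stress_from_strain(eta, lam, mu)
print("lambda, mu, k:", lam, mu, k)
print("P_11, P_22, P_33:", stress[0][0], stress[1][1], stress[2][2])
```

Up to rounding, the recovered stress is P along the rod and zero in the transverse directions, and k reduces to E/[3(1 − 2σ)], which diverges as σ approaches 1/2.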
Generalized Hooke's Law

For the general case covering anisotropic as well as isotropic solids we
express the linear stress-strain relation by a generalized Hooke’s law,

$$P_{ij} = c_{ijkl}\,\eta_{kl}, \tag{3.97}$$

where $c_{ijkl}$ is a fourth-rank tensor by the quotient rule, Eq. 3.29d. Since the
stress tensor $P_{ij}$ and the strain tensor $\eta_{kl}$ are both symmetric,

$$c_{ijkl} = c_{jikl} = c_{ijlk} = c_{jilk}, \tag{3.98}$$

reducing the number of components from 81 ($3^4$) to 36. It may further be
shown⁴ that

$$c_{ijkl} = c_{klij}, \tag{3.99}$$

which further reduces the number of independent components to 21. 
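
The counts 81, 36, and 21 follow from simple enumeration. The sketch below groups the 81 index combinations into classes that the symmetries of Eqs. 3.98 and 3.99 force to be equal and counts the classes. The canonical-label construction is our own device for the illustration.

```python
# Sketch: count the independent components of c_ijkl under the index
# symmetries of Eqs. 3.98 and 3.99. Names are illustrative assumptions.
from itertools import product

def canonical(i, j, k, l, pair_exchange):
    """Label shared by all index sets that the symmetries force to be equal."""
    first, second = tuple(sorted((i, j))), tuple(sorted((k, l)))
    if pair_exchange:                        # c_ijkl = c_klij
        first, second = sorted((first, second))
    return first + second

def count_independent(pair_exchange):
    """Number of distinct classes among the 81 index combinations."""
    return len({canonical(i, j, k, l, pair_exchange)
                for i, j, k, l in product(range(3), repeat=4)})

print(3 ** 4)                        # no symmetry imposed
print(count_independent(False))      # symmetric in ij and in kl: 6 x 6
print(count_independent(True))       # plus exchange of the pairs: 6 x 7 / 2
```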

If we apply our general tensor relation (Eq. 3.97) to an isotropic body, the
elastic constant tensor $c_{ijkl}$ must be a linear combination of the most general
isotropic tensors of fourth rank. Using the results of Exercise 3.4.9, we have

$$c_{ijkl} = a\,\delta_{ij}\delta_{kl} + b\,[\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}] + c\,[\delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk}]. \tag{3.100}$$

By substituting into Eq. 3.97, we obtain

$$P_{ij} = a\,\delta_{ij}\,\eta_{kk} + b\,(\eta_{ij} + \eta_{ji}) + c\,(\eta_{ij} - \eta_{ji}) \tag{3.101}$$

as before. Since $\eta_{ij}$ is symmetric, this reduces to

$$P_{ij} = a\,\eta_{kk}\,\delta_{ij} + 2b\,\eta_{ij}$$

in complete agreement with Eq. 3.91.
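
The reduction from Eq. 3.100 to the two-constant form can be confirmed by carrying out the double contraction of Eq. 3.97 directly. The sketch below uses small integer values for a, b, c and for a symmetric strain array, all chosen arbitrarily for the example; the function names are ours.

```python
# Sketch: contract the isotropic c_ijkl of Eq. 3.100 with a symmetric strain
# and compare with a eta_kk delta_ij + 2 b eta_ij. Values are assumptions.

def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

def c_isotropic(i, j, k, l, a, b, c):
    """General isotropic fourth-rank tensor, Eq. 3.100."""
    return (a * delta(i, j) * delta(k, l)
            + b * (delta(i, k) * delta(j, l) + delta(i, l) * delta(j, k))
            + c * (delta(i, k) * delta(j, l) - delta(i, l) * delta(j, k)))

def hooke(eta, a, b, c):
    """P_ij = c_ijkl eta_kl, Eq. 3.97, by explicit summation."""
    return [[sum(c_isotropic(i, j, k, l, a, b, c) * eta[k][l]
                 for k in range(3) for l in range(3))
             for j in range(3)] for i in range(3)]

a, b, c = 2, 3, 7
eta = [[1, 2, 0],
       [2, -1, 4],
       [0, 4, 5]]

trace = sum(eta[k][k] for k in range(3))
expected = [[a * trace * delta(i, j) + 2 * b * eta[i][j]
             for j in range(3)] for i in range(3)]

print(hooke(eta, a, b, c) == expected)
print(hooke(eta, a, b, 0) == hooke(eta, a, b, c))   # the c term drops out
```

Comparison with Eq. 3.91 then identifies a with λ and b with μ, while the antisymmetric combination multiplying c never contributes for a symmetric strain.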

The properties and applications of the fourth-rank Hooke’s law tensor are 
explored further in the exercises. 

All the preceding discussion of elasticity has been in a cartesian framework — 
to simplify the mathematics. But sometimes problems in the real, physical 
world simply demand some other coordinate system. In considering the free 
oscillations of the Earth, for instance, we generally rewrite the elastic relations 
of this section and the equations of motion in spherical polar coordinates. The 
result is a complicated set of simultaneous second-order partial differential 
equations which can be solved only by numerical integration (Section 8.8). The 
comparison of such theoretical-numerical results and seismological records of 
the free oscillations of our elastic Earth has yielded information about the 
structure of the Earth’s interior. 


EXERCISES 

3.6.1 The three-dimensional fourth-rank stress-strain tensor $c_{ijkl}$ satisfies

$$c_{ijkl} = c_{ijlk} = c_{jilk}.$$

(a) Show that application of these symmetry conditions reduces the number of
independent components or elements of $c_{ijkl}$ from 81 to 36.

(b) If we further specify that

$$c_{ijkl} = c_{klij},$$

show that the number of independent components drops to 21.


4 Compare I. S. Sokolnikoff, Mathematical Theory of Elasticity. New York:
McGraw-Hill (1956).


3.6.2 (a) What arguments can you adduce to demonstrate that Poisson's ratio $\sigma$ is
nonnegative?

(b) Assuming that the shear modulus $\mu$ and the bulk modulus $k$ are each
nonnegative, set an upper limit to the value of Poisson's ratio.

ANS. (b) $\sigma < 1/2$.


3.6.3 Calculate the elastic potential energy (per unit volume) of an elastic isotropic body
subjected to a small strain.

ANS. $\dfrac{\mathrm{P.E.}}{\text{volume}} = \tfrac{1}{2}\lambda\,(\eta_{ii})^2 + \mu\,\eta_{ij}\eta_{ij}$.


3.6.4 The potential energy density of a strained elastic solid is given by

$$\mathrm{P.E.} = \tfrac{1}{2}\,c_{ijkl}\,\eta_{ij}\,\eta_{kl}.$$

If the solid has cubic symmetry,

(a) Show that any $c_{ijkl}$ in which any subscript (1, 2, 3 or x, y, z) appears an odd
number of times vanishes; that is,

$$c_{1112} = 0.$$

Hint. Reflect the coordinate appearing an odd number of times.

(b) Show that there remain three distinct nonzero elastic constants

$$c_{1111} = c_{2222} = c_{3333},$$

$$c_{1122} = c_{2211} = c_{1133}, \text{ and so on},$$

$$c_{1212} = c_{2121} = c_{1313}, \text{ and so on}$$

$$\phantom{c_{1212}} = c_{1221}, \text{ and so on},$$

for a total of 21 elements.


3.6.5 If the atomic force between every two atoms of our elastic body is along the line
joining the two atoms and each atom is a center of symmetry, then, as shown by
Cauchy, $c_{ijkl} = c_{kjil}$. Given (1) an isotropic elastic body and (2) this symmetry
condition of Cauchy, show that the elastic constant tensor is completely
symmetric under all permutations of the indices.


3.6.6 If our elastic solid is isotropic, $c_{ijkl}$ will have 21 nonvanishing components. Express
these 21 components in terms of Young's modulus $E$ and Poisson's ratio $\sigma$.

ANS. $c_{1111} = E\,\dfrac{1 - \sigma}{(1 + \sigma)(1 - 2\sigma)}$,

$c_{1122} = E\,\dfrac{\sigma}{(1 + \sigma)(1 - 2\sigma)}$,

$c_{1212} = E\,\dfrac{1}{2(1 + \sigma)} = \mu$.


3.6.7 The original strain tensor $\partial u_i/\partial x_k$ is reducible in the sense of Section 3.4. A partial
reduction, the splitting off of the antisymmetric $\xi_{ik}$, is carried out in the first part
of Section 3.6. Completing the reduction, we may write

$$\begin{pmatrix} \eta_{11} & \eta_{12} & \eta_{13} \\ \eta_{21} & \eta_{22} & \eta_{23} \\ \eta_{31} & \eta_{32} & \eta_{33} \end{pmatrix} = \begin{pmatrix} \eta/3 & 0 & 0 \\ 0 & \eta/3 & 0 \\ 0 & 0 & \eta/3 \end{pmatrix} + \begin{pmatrix} \eta_{11} - \eta/3 & \eta_{12} & \eta_{13} \\ \eta_{21} & \eta_{22} - \eta/3 & \eta_{23} \\ \eta_{31} & \eta_{32} & \eta_{33} - \eta/3 \end{pmatrix}.$$

Here $\eta$ is the contracted (scalar) $\eta_{ii}$.

(a) Show that the tensor $\tfrac{1}{3}\eta\,\delta_{ij}$ describes a change in volume and no change
of shape.

(b) Show that the second tensor, $\eta_{ij} - \tfrac{1}{3}\eta\,\delta_{ij}$, describes a change in shape
(shear) with no change in volume to first order.

Note. Our elasticity theory here is a first-order theory. Discard second- and
higher-order terms.

3.6.8 (a) Derive the equation for waves in an elastic medium

$$m\,\frac{\partial^2 \mathbf{u}}{\partial t^2} = \left(k + \tfrac{4}{3}\mu\right)\nabla\nabla\cdot\mathbf{u} - \mu\,\nabla\times\nabla\times\mathbf{u}.$$

Hint. Consider the net force on a unit cube (mass $m$).

(b) If the displacement $\mathbf{u}$ is irrotational, show that the elastic waves are propagated
with a velocity $v = [(k + \tfrac{4}{3}\mu)/m]^{1/2}$ and are longitudinal (plane waves or
spherical waves at large distances).

(c) If the displacement $\mathbf{u}$ is solenoidal, show that the elastic waves are propagated
with velocity $v = (\mu/m)^{1/2}$ and are transverse (plane waves or spherical waves
at large distances).


3.7 LORENTZ COVARIANCE OF MAXWELL'S 
EQUATIONS 

If a physical law is to hold for all orientations of our (real) spatial coordinates
(i.e., to be invariant under rotations), the terms of the equation must be covariant
under rotations (Sections 1.2 and 3.1). This means that we write the
physical laws in the mathematical form scalar = scalar, vector = vector,
second-rank tensor = second-rank tensor, and so on. Similarly, if a physical
law is to hold for all inertial systems, the terms of the equation must be covariant
under Lorentz transformations.

Using Minkowski space ($x = x_1$, $y = x_2$, $z = x_3$, $ict = x_4$), we have a four-dimensional
cartesian space in that the metric $g_{ij} = \delta_{ij}$ (Section 2.1). The
Lorentz transformations take the form of a "rotation" in this four-dimensional
complex space.$^1$

Here we consider Maxwell's equations

$$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}, \tag{3.102a}$$

$$\nabla\times\mathbf{H} = \frac{\partial\mathbf{D}}{\partial t} + \rho\mathbf{v}, \tag{3.102b}$$

$$\nabla\cdot\mathbf{D} = \rho, \tag{3.102c}$$

$$\nabla\cdot\mathbf{B} = 0, \tag{3.102d}$$

and the relations

$$\mathbf{D} = \varepsilon_0\mathbf{E}, \qquad \mathbf{B} = \mu_0\mathbf{H}. \tag{3.103}$$

The symbols have their usual meanings as given in the introduction. For
simplicity we assume vacuum ($\varepsilon = \varepsilon_0$, $\mu = \mu_0$).


1 A group theoretic derivation of the Lorentz transformation in Minkowski
space appears in Section 4.13. See also H. Goldstein, Classical Mechanics.
Cambridge, Mass.: Addison-Wesley (1951), Chapter 6. The tensor equation
for a photon, $\sum_\lambda x_\lambda^2 = 0$, independent of reference frame, leads to the Lorentz
transformations.


We assume that Maxwell’s equations hold in all inertial systems; that is, 
Maxwell’s equations are consistent with special relativity. (The covariance of 
Maxwell’s equations under Lorentz transformations was actually shown by 
Lorentz and Poincare before Einstein proposed his theory of special relativity.) 
Our immediate goal is to rewrite Maxwell’s equations as tensor equations in 
Minkowski space. This will make the Lorentz covariance explicit. 

In terms of scalar and magnetic vector potentials, we may write$^2$

$$\mathbf{B} = \nabla\times\mathbf{A}, \qquad \mathbf{E} = -\frac{\partial\mathbf{A}}{\partial t} - \nabla\varphi. \tag{3.104}$$


Equation 3.104 specifies the curl of $\mathbf{A}$; the divergence of $\mathbf{A}$ is still undefined
(compare Sections 1.13 and 1.15). We may, and for future convenience we do,
impose the further restriction on the vector potential $\mathbf{A}$,

$$\nabla\cdot\mathbf{A} + \varepsilon_0\mu_0\,\frac{\partial\varphi}{\partial t} = 0. \tag{3.105}$$

This is the Lorentz relation. It will serve the purpose of uncoupling the differential
equations for $\mathbf{A}$ and $\varphi$ that follow. The potentials $\mathbf{A}$ and $\varphi$ are not yet
completely fixed. The freedom remaining is the topic of Exercise 3.7.4.

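Now we rewrite the Maxwell equations in terms of the potentials $\mathbf{A}$ and $\varphi$.
From Eqs. 3.102c for $\nabla\cdot\mathbf{D}$ and 3.104

$$\nabla^2\varphi + \nabla\cdot\frac{\partial\mathbf{A}}{\partial t} = -\frac{\rho}{\varepsilon_0}, \tag{3.106}$$

whereas Eqs. 3.102b for $\nabla\times\mathbf{H}$ and 3.104 and Eq. 1.80 of Chapter 1 yield

$$\frac{\partial^2\mathbf{A}}{\partial t^2} + \nabla\frac{\partial\varphi}{\partial t} + \frac{1}{\varepsilon_0\mu_0}\{\nabla\nabla\cdot\mathbf{A} - \nabla^2\mathbf{A}\} = \frac{\rho\mathbf{v}}{\varepsilon_0}. \tag{3.107}$$

Using the Lorentz relation, Eq. 3.105, and the relation $\varepsilon_0\mu_0 = 1/c^2$, we obtain

$$\left[\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right]\mathbf{A} = -\mu_0\rho\mathbf{v},$$

$$\left[\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\right]\varphi = -\frac{\rho}{\varepsilon_0}. \tag{3.108}$$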


Now the differential operator

$$\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}$$

becomes in Minkowski space

$$\frac{\partial^2}{\partial x_\mu^2}.$$

Here we adopt Greek indices, as is customary in relativity theory, to indicate a
summation from 1 to 4. This summation,

$$\frac{\partial^2}{\partial x_\mu^2} \equiv \sum_{\mu=1}^{4}\frac{\partial^2}{\partial x_\mu^2} = \nabla^2 + \frac{\partial^2}{\partial x_4^2}, \qquad x_4 = ict,$$

is a four-dimensional Laplacian, usually called the d'Alembertian and denoted
by $\Box^2$. It may readily be proved a scalar (see Exercise 3.2.3).


2 Compare Section 1.13, especially Exercise 1.13.10.

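For convenience we define

$$A_1 \equiv \frac{A_x}{\mu_0 c} = c\varepsilon_0 A_x, \qquad A_2 \equiv \frac{A_y}{\mu_0 c} = c\varepsilon_0 A_y,$$

$$A_3 \equiv \frac{A_z}{\mu_0 c} = c\varepsilon_0 A_z, \qquad A_4 \equiv i\varepsilon_0\varphi. \tag{3.109}$$

If we further put

$$\frac{\rho v_x}{c} \equiv i_1, \qquad \frac{\rho v_y}{c} \equiv i_2, \qquad \frac{\rho v_z}{c} \equiv i_3, \qquad i\rho \equiv i_4, \tag{3.110}$$

then Eq. 3.108 may be written in the form

$$\Box^2 A_\mu = -i_\mu. \tag{3.111}$$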

Equation 3.111 looks like a tensor equation, but looks do not constitute
proof. To prove that it is a tensor equation, we start by investigating the
transformation properties of the generalized current $i_\mu$.

Since an electric charge element $de$ is an invariant quantity, we have

$$de = \rho\,dx_1\,dx_2\,dx_3, \quad \text{invariant}. \tag{3.112}$$

We saw in Section 3.4 that the four-dimensional volume element, $dx_1\,dx_2\,dx_3\,dx_4$,
was also invariant. Comparing this result, Eq. 3.53, with Eq. 3.112, we see
that the charge density $\rho$ must transform the same way as $dx_4$, the fourth
component of a four-dimensional vector $dx_\lambda$. We put $i\rho = i_4$, with $i_4$ now
established as the fourth component of a four-dimensional vector. The other
parts of Eq. 3.110 may be expanded as

$$i_1 = \frac{\rho v_x}{c} = \frac{\rho}{c}\frac{dx_1}{dt} = \frac{i\rho}{ic}\frac{dx_1}{dt} = i_4\,\frac{dx_1}{dx_4}. \tag{3.113}$$

Since we have just shown that $i_4$ transforms as $dx_4$, this means that $i_1$ transforms
as $dx_1$. With similar results for $i_2$ and $i_3$, we have $i_\lambda$ transforming as $dx_\lambda$,
proving that $i_\lambda$ is a vector, a four-dimensional vector in Minkowski space.

Equation 3.111, which follows directly from Maxwell's equations, Eq. 3.102,
is assumed to hold in all cartesian systems (all Lorentz frames). Then, by the
quotient rule, Section 3.3, $A_\mu$ is also a vector and Eq. 3.111 is a legitimate
tensor equation.

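Now, working backward, Eq. 3.104 may be written

$$i\varepsilon_0 E_j = \frac{\partial A_j}{\partial x_4} - \frac{\partial A_4}{\partial x_j}, \qquad j = 1, 2, 3,$$

$$\frac{1}{\mu_0 c}B_i = \frac{\partial A_k}{\partial x_j} - \frac{\partial A_j}{\partial x_k}, \qquad i, j, k = 1, 2, 3 \tag{3.114}$$

and cyclic permutations.

We define a new tensor

$$\frac{\partial A_\lambda}{\partial x_\mu} - \frac{\partial A_\mu}{\partial x_\lambda} \equiv f_{\mu\lambda} = -f_{\lambda\mu},$$

an antisymmetric second-rank tensor, since $A_\lambda$ is a vector. Written out explicitly,

$$f_{\mu\lambda} = \varepsilon_0\begin{pmatrix} 0 & cB_z & -cB_y & -iE_x \\ -cB_z & 0 & cB_x & -iE_y \\ cB_y & -cB_x & 0 & -iE_z \\ iE_x & iE_y & iE_z & 0 \end{pmatrix}. \tag{3.115}$$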


Notice that in our four-dimensional Minkowski space $\mathbf{E}$ and $\mathbf{B}$ are no longer
vectors but together form a second-rank tensor. With this tensor we may write
the two nonhomogeneous Maxwell equations (3.102b and 3.102c) combined as
a tensor equation

$$\frac{\partial f_{\lambda\mu}}{\partial x_\mu} = i_\lambda. \tag{3.116}$$

The left-hand side of Eq. 3.116 is a four-dimensional divergence of a tensor
and therefore a vector. This, of course, is equivalent to contracting a third-rank
tensor $\partial f_{\lambda\mu}/\partial x_\nu$ (compare Exercises 3.2.1 and 3.2.2). The two homogeneous
Maxwell equations, 3.102a for $\nabla\times\mathbf{E}$ and 3.102d for $\nabla\cdot\mathbf{B}$, may be expressed
in the tensor form

$$\frac{\partial f_{23}}{\partial x_1} + \frac{\partial f_{31}}{\partial x_2} + \frac{\partial f_{12}}{\partial x_3} = 0 \tag{3.117}$$

for Eq. 3.102d and three equations of the form

$$\frac{\partial f_{34}}{\partial x_2} + \frac{\partial f_{42}}{\partial x_3} + \frac{\partial f_{23}}{\partial x_4} = 0 \tag{3.118}$$

for Eq. 3.102a. (A second equation permutes 124, a third permutes 134.) Since

$$\frac{\partial f_{\lambda\mu}}{\partial x_\nu} \equiv t_{\lambda\mu\nu}$$

is a tensor (of third rank), Eqs. 3.102a and 3.102d are given by the tensor
equation

$$t_{\lambda\mu\nu} + t_{\nu\lambda\mu} + t_{\mu\nu\lambda} = 0. \tag{3.119}$$

From Eqs. 3.117 and 3.118 the reader will understand that the indices $\lambda$, $\mu$,
and $\nu$ are supposed to be different. Actually Eq. 3.119 automatically reduces
to 0 = 0 if any two indices coincide. An alternate form of Eq. 3.119 appears
in Exercise 3.7.14.


Lorentz Transformation of E and B 

The construction of the tensor equations (3.116 and 3.119) completes our
initial goal of rewriting Maxwell's equations in tensor form.$^3$ Now we exploit
the tensor properties of our four vectors and the tensor $f_{\mu\lambda}$.

For the Lorentz transformation corresponding to motion along the $z$($x_3$)-axis
with velocity $v$, the "direction cosines" are$^4$

$$(a_{\mu\nu}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & \gamma & i\beta\gamma \\ 0 & 0 & -i\beta\gamma & \gamma \end{pmatrix}, \tag{3.120}$$

where

$$\beta = \frac{v}{c}$$

and

$$\gamma = (1 - \beta^2)^{-1/2}. \tag{3.121}$$

Using the tensor transformation properties, we may calculate the electric and
magnetic fields in the moving system in terms of the values in the original
reference frame. From Eqs. 3.13, 3.115, and 3.120 we obtain

$$E'_x = \frac{1}{\sqrt{1 - \beta^2}}\,(E_x - vB_y),$$

$$E'_y = \frac{1}{\sqrt{1 - \beta^2}}\,(E_y + vB_x), \tag{3.122}$$

$$E'_z = E_z,$$

and

$$B'_x = \frac{1}{\sqrt{1 - \beta^2}}\left(B_x + \frac{v}{c^2}E_y\right),$$

$$B'_y = \frac{1}{\sqrt{1 - \beta^2}}\left(B_y - \frac{v}{c^2}E_x\right), \tag{3.123}$$

$$B'_z = B_z.$$


3 Modern theories of quantum electrodynamics and elementary particles are
often written in this "manifestly covariant" form to guarantee consistency
with special relativity. Conversely, the insistence on such tensor form has
been a useful guide in the construction of these theories.

4 A group theoretic derivation of the Lorentz transformation appears in
Section 4.13. See also Goldstein, Chapter 6.
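
The step from Eqs. 3.115 and 3.120 to the transformed fields can be checked numerically. The sketch below is illustrative only: it builds $f_{\mu\lambda}$ with the imaginary fourth coordinate $x_4 = ict$, applies $f'_{\mu\nu} = a_{\mu\alpha}a_{\nu\beta}f_{\alpha\beta}$, reads $\mathbf{E}'$ and $\mathbf{B}'$ back out, and compares them with the closed forms of Eqs. 3.122 and 3.123. The numerical values (including `c = 3.0` and `eps0 = 1.0`) and all function names are assumptions chosen for convenience, not physical constants.

```python
import math

def field_tensor(E, B, c, eps0):
    """Eq. 3.115: f_12 = eps0*c*B_z, f_41 = i*eps0*E_x, and so on."""
    (Ex, Ey, Ez), (Bx, By, Bz) = E, B
    rows = [[0, c * Bz, -c * By, -1j * Ex],
            [-c * Bz, 0, c * Bx, -1j * Ey],
            [c * By, -c * Bx, 0, -1j * Ez],
            [1j * Ex, 1j * Ey, 1j * Ez, 0]]
    return [[eps0 * x for x in row] for row in rows]

def boost_z(beta):
    """Eq. 3.120: 'direction cosines' for motion along z with beta = v/c."""
    g = 1.0 / math.sqrt(1.0 - beta ** 2)
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, g, 1j * beta * g],
            [0, 0, -1j * beta * g, g]]

def transform(a, f):
    """Second-rank tensor transformation f'_mn = a_mp a_nq f_pq."""
    return [[sum(a[m][p] * a[n][q] * f[p][q]
                 for p in range(4) for q in range(4))
             for n in range(4)] for m in range(4)]

def fields_from_tensor(f, c, eps0):
    """Invert Eq. 3.115: E_j from f_4j, B from f_23, f_31, f_12."""
    E = tuple((f[3][j] / (1j * eps0)).real for j in range(3))
    B = ((f[1][2] / (c * eps0)).real,
         (f[2][0] / (c * eps0)).real,
         (f[0][1] / (c * eps0)).real)
    return E, B

c, eps0, v = 3.0, 1.0, 1.8          # illustrative units, beta = 0.6
E, B = (1.0, 2.0, 3.0), (0.4, -0.5, 0.6)

f_prime = transform(boost_z(v / c), field_tensor(E, B, c, eps0))
E_prime, B_prime = fields_from_tensor(f_prime, c, eps0)

g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
E_closed = (g * (E[0] - v * B[1]), g * (E[1] + v * B[0]), E[2])
B_closed = (g * (B[0] + v * E[1] / c ** 2), g * (B[1] - v * E[0] / c ** 2), B[2])
max_diff = max(abs(x - y) for x, y in zip(E_prime + B_prime, E_closed + B_closed))
print(E_prime, B_prime, max_diff)
```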

This coupling of $\mathbf{E}$ and $\mathbf{B}$ is to be expected. Consider, for instance, the case
of zero electric field in the unprimed system

$$E_x = E_y = E_z = 0.$$

Clearly, there will be no force on a stationary charged particle. When the
particle is in motion with a small velocity $\mathbf{v}$ along the $z$-axis,$^5$ an observer on
the particle sees fields (exerting a force on his charged particle) given by

$$E'_x = -vB_y,$$

$$E'_y = vB_x,$$

where $\mathbf{B}$ is a magnetic induction field in the unprimed system. These equations
may be put in vector form

$$\mathbf{E}' = \mathbf{v}\times\mathbf{B}$$

or

$$\mathbf{F} = q\mathbf{v}\times\mathbf{B}, \tag{3.124}$$

which is usually taken as the operational definition of the magnetic induction $\mathbf{B}$.


Electromagnetic Invariants 

Finally, the tensor (or vector) properties allow us to construct a multitude
of invariant quantities. A more important one is the scalar product of the two
four-dimensional vectors or four vectors $A_\lambda$ and $i_\lambda$. We have

$$A_\lambda i_\lambda = c\varepsilon_0 A_x\frac{\rho v_x}{c} + c\varepsilon_0 A_y\frac{\rho v_y}{c} + c\varepsilon_0 A_z\frac{\rho v_z}{c} + i\varepsilon_0\varphi\,i\rho = \varepsilon_0(\mathbf{A}\cdot\mathbf{J} - \rho\varphi), \quad \text{invariant}, \tag{3.125}$$

with $\mathbf{A}$ the usual magnetic vector potential and $\mathbf{J}$ the ordinary current density.
The final term $\rho\varphi$ is the ordinary static electric coupling with dimensions of
energy per unit volume. Hence our newly constructed scalar invariant is an
energy density. The dynamic interaction of field and current is given by the
product $\mathbf{A}\cdot\mathbf{J}$. This invariant $A_\lambda i_\lambda$ appears in the electromagnetic Lagrangians
of Exercises 17.3.6 and 17.5.1.


5 If the velocity is not small (so that $v^2/c^2$ is not negligible), a relativistic
transformation of force is needed.

Other possible electromagnetic invariants appear in Exercises 3.7.9 and
3.7.11.
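
A quick numerical spot check of two such invariants, $c^2B^2 - E^2$ and $\mathbf{E}\cdot\mathbf{B}$, is sketched below. It applies the boost formulas of Eqs. 3.122 and 3.123 directly and compares the two quantities before and after. This is only a check at one set of illustrative field values, not a proof, and the function names are assumptions.

```python
import math

def boost_fields(E, B, v, c):
    """Eqs. 3.122 and 3.123: fields seen from a frame moving along z at speed v."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    E_new = (g * (Ex - v * By), g * (Ey + v * Bx), Ez)
    B_new = (g * (Bx + v * Ey / c ** 2), g * (By - v * Ex / c ** 2), Bz)
    return E_new, B_new

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def invariants(E, B, c):
    """Return (c^2 B^2 - E^2, E . B)."""
    return c ** 2 * dot(B, B) - dot(E, E), dot(E, B)

c, v = 3.0, 1.8                      # illustrative units
E, B = (1.0, 2.0, 3.0), (0.4, -0.5, 0.6)
before = invariants(E, B, c)
after = invariants(*boost_fields(E, B, v, c), c)
print(before, after)
```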


EXERCISES 

3.7.1 (a) Show that every four vector in Minkowski space may be decomposed into
an ordinary three-space vector and a three-space scalar. Examples: $(\mathbf{r}, ict)$,
$(\rho\mathbf{v}/c, i\rho)$, $(c\varepsilon_0\mathbf{A}, i\varepsilon_0\varphi)$, $(\mathbf{p}, iE/c)$, $(\mathbf{k}, i\omega/c)$.

Hint. Consider a rotation of the three-space coordinates with time fixed.

(b) Show that the converse of (a) is not true: every three vector plus scalar
does not form a Minkowski four vector.

3.7.2 (a) Show that

$$\frac{\partial i_\mu}{\partial x_\mu} = 0.$$

(b) Show how the previous tensor equation may be interpreted as a statement
of continuity of charge and current in ordinary three-dimensional space
and time.

(c) If this equation is known to hold in all Lorentz reference frames, why can
we not conclude that $i_\mu$ is a vector?

3.7.3 Write the Lorentz condition (Eq. 3.105) as a tensor equation in Minkowski
space.

3.7.4 A gauge transformation consists of varying the scalar potential $\varphi_1$ and the
vector potential $\mathbf{A}_1$ according to the relation

$$\varphi_2 = \varphi_1 + \frac{\partial\chi}{\partial t},$$

$$\mathbf{A}_2 = \mathbf{A}_1 - \nabla\chi.$$

The new function $\chi$ is required to satisfy the homogeneous wave equation

$$\nabla^2\chi - \varepsilon_0\mu_0\,\frac{\partial^2\chi}{\partial t^2} = 0.$$

Show the following:

(a) The Lorentz relation is unchanged.

(b) The new potentials satisfy the same inhomogeneous wave equations as
did the original potentials.

(c) The fields $\mathbf{E}$ and $\mathbf{B}$ are unaltered.

The invariance of our electromagnetic theory under this transformation is called
gauge invariance.

3.7.5 A charged particle, charge $q$, mass $m$, obeys the Lorentz covariant equation

$$\frac{dp_\mu}{d\tau} = \frac{q}{\varepsilon_0 c}\,f_{\mu\nu}\,\frac{dx_\nu}{d\tau}.$$

$p_\mu$ is the four-dimensional momentum vector $(p_1, p_2, p_3, iE/c)$. $\tau$ is the proper
time; $d\tau = dt\sqrt{1 - v^2/c^2}$, a Lorentz scalar. Show that the explicit space-time
forms are

$$\frac{d\mathbf{p}}{dt} = q(\mathbf{E} + \mathbf{v}\times\mathbf{B}),$$

$$\frac{dE}{dt} = q\mathbf{v}\cdot\mathbf{E}.$$

3.7.6 From the Lorentz transformation matrix elements (Eq. 3.120) derive the Einstein
velocity addition law

$$u' = \frac{u - v}{1 - (uv/c^2)}, \qquad u = \frac{u' + v}{1 + (u'v/c^2)},$$

where $u = ic\,dx_3/dx_4$ and $u' = ic\,dx'_3/dx'_4$.

Hint. If $L_{12}(v)$ is the matrix transforming system 1 into system 2, $L_{23}(u')$ the
matrix transforming system 2 into system 3, $L_{13}(u)$ the matrix transforming
system 1 directly into system 3, then $L_{13}(u) = L_{23}(u')L_{12}(v)$. From this matrix
relation extract the Einstein velocity addition law.

3.7.7 The dual of a four-dimensional second-rank tensor $\mathsf{B}$ may be defined by $\mathsf{B}^*$,
where the elements of the dual tensor are given by

$$B^*_{ij} = \frac{1}{2!}\,\varepsilon_{ijkl}B_{kl}.$$


Show that B* transforms as 

(a) a second-rank tensor under rotations, 

(b) a pseudotensor under inversions. 

Note. The asterisk here does not mean complex conjugate. 


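3.7.8 Construct $\mathsf{f}^*$, the dual of $\mathsf{f}$, where $\mathsf{f}$ is the electromagnetic tensor given by Eq.
3.115.

ANS. $$\mathsf{f}^* = \varepsilon_0\begin{pmatrix} 0 & -iE_z & iE_y & cB_x \\ iE_z & 0 & -iE_x & cB_y \\ -iE_y & iE_x & 0 & cB_z \\ -cB_x & -cB_y & -cB_z & 0 \end{pmatrix}.$$

This corresponds to

$$c\mathbf{B} \to -i\mathbf{E},$$
$$-i\mathbf{E} \to c\mathbf{B}.$$

This transformation, sometimes called a "dual transformation," leaves Maxwell's
equations in vacuum ($\rho = 0$) invariant.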


3.7.9 As the quadruple contraction of a fourth-rank pseudotensor and two second-rank
tensors, $\varepsilon_{\mu\lambda\nu\sigma}f_{\mu\lambda}f_{\nu\sigma}$ is clearly a pseudoscalar. Evaluate it.

ANS. $-8i\varepsilon_0^2\,c\,\mathbf{B}\cdot\mathbf{E}$.

3.7.10 (a) If an electromagnetic field is purely electric (or purely magnetic) in one 

particular Lorentz frame, show that E and B will be orthogonal in other 
Lorentz reference systems. 

(b) Conversely, if E and B are orthogonal in one particular Lorentz frame, there 
exists a Lorentz reference system in which E (or B) vanishes. Find that 
reference system. 

3.7.11 Show that c 2 B 2 — E 2 is a scalar invariant. 


3.7.12 Since $(dx_1, dx_2, dx_3, dx_4)$ is a vector, $dx_\mu dx_\mu$ is a scalar. Evaluate this scalar for
a moving particle in two different coordinate systems: (a) a coordinate system
fixed relative to you (lab system), and (b) a coordinate system moving with a
moving particle (velocity $v$ relative to you). With the time increment labeled
$d\tau$ in the particle system and $dt$ in the lab system, show that

$$d\tau = dt\sqrt{1 - v^2/c^2}.$$

$\tau$ is the proper time of the particle, a Lorentz invariant quantity.





3.7.13 Expand the scalar expression


An U'Uv + 


in terms of the fields and potentials. The resulting expression is the Lagrangian 
density used in Exercise 17.5.1. 


3.7.14 Show that Eq. 3.119 may be written

$$\varepsilon_{\alpha\beta\gamma\delta}\,\frac{\partial f_{\alpha\beta}}{\partial x_\gamma} = 0.$$


3.8 NONCARTESIAN TENSORS. COVARIANT
DIFFERENTIATION

The distinction between contravariant transformations and covariant transformations
was established in Section 3.1. Then, for convenience, we restricted
our attention to cartesian coordinates (in which the distinction disappears).
Now in these two concluding sections we return to noncartesian coordinates
and resurrect the contravariant and covariant dependence. As in Section 3.1,
a superscript will be used for an index denoting contravariant and a subscript
for an index denoting covariant dependence. The metric tensor of Section 2.1
will be used to relate contravariant and covariant indices.

The emphasis in this section is on differentiation, culminating in the construction
of the covariant derivative. We saw in Section 3.2 that the derivative of a
vector yields a second-rank tensor in cartesian coordinates. The covariant
derivative of a vector yields a second-rank tensor in noncartesian coordinate
systems.

Metric Tensor, Raising and Lowering Indices 

Let us start with a set of basis vectors $\boldsymbol{\varepsilon}_i$ such that an infinitesimal displacement
$d\mathbf{r}$ would be given by

$$d\mathbf{r} = \boldsymbol{\varepsilon}_1\,dq^1 + \boldsymbol{\varepsilon}_2\,dq^2 + \boldsymbol{\varepsilon}_3\,dq^3. \tag{3.126}$$

For convenience we take $\boldsymbol{\varepsilon}_1$, $\boldsymbol{\varepsilon}_2$, and $\boldsymbol{\varepsilon}_3$ to form a right-handed set. These
vectors are not necessarily orthogonal. The oblique coordinates of Section 4.4
furnish a convenient example of a nonorthogonal system. Also, a limitation to
three-dimensional space will be required only for the discussions of cross
products and curls. Otherwise these $\boldsymbol{\varepsilon}_i$ may be in $N$-dimensional space, including
the four-dimensional space-time of special and general relativity. The basis
vectors may be expressed by

$$\boldsymbol{\varepsilon}_i = \frac{\partial\mathbf{r}}{\partial q^i}, \tag{3.127}$$

as in Exercise 2.2.3. Note, however, that the $\boldsymbol{\varepsilon}_i$ here do not necessarily have unit
magnitude. From Exercise 2.2.3, the unit vectors are

$$\mathbf{e}_i = \frac{1}{h_i}\frac{\partial\mathbf{r}}{\partial q^i} \qquad \text{(no summation)}$$

and therefore

$$\boldsymbol{\varepsilon}_i = h_i\mathbf{e}_i \qquad \text{(no summation)}. \tag{3.128}$$

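The $\boldsymbol{\varepsilon}_i$ are related to the unit vectors $\mathbf{e}_i$ by the scale factors $h_i$ of Section 2.2.
The $\mathbf{e}_i$ have no dimensions; the $\boldsymbol{\varepsilon}_i$ have the dimensions of $h_i$. In spherical polar
coordinates, as a specific example,

$$\boldsymbol{\varepsilon}_r = \mathbf{e}_r,$$

$$\boldsymbol{\varepsilon}_\theta = r\,\mathbf{e}_\theta, \tag{3.129}$$

$$\boldsymbol{\varepsilon}_\varphi = r\sin\theta\,\mathbf{e}_\varphi.$$

As in Section 2.1, we construct the square of a differential displacement

$$(ds)^2 = d\mathbf{r}\cdot d\mathbf{r} = (\boldsymbol{\varepsilon}_i\,dq^i)^2 = \boldsymbol{\varepsilon}_i\cdot\boldsymbol{\varepsilon}_j\,dq^i\,dq^j. \tag{3.130}$$

Comparing this with $(ds)^2$ of Section 2.1, Eq. 2.4, we identify $\boldsymbol{\varepsilon}_i\cdot\boldsymbol{\varepsilon}_j$ as the
covariant metric tensor

$$\boldsymbol{\varepsilon}_i\cdot\boldsymbol{\varepsilon}_j = g_{ij}. \tag{3.131}$$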

Clearly, $g_{ij}$ is symmetric. The tensor nature of $g_{ij}$ follows from the quotient
rule, Exercise 3.3.1. We take the relation

$$g^{ik}g_{kj} = \delta^i_j \tag{3.132}$$

to define the corresponding contravariant tensor $g^{ik}$. Contravariant $g^{ik}$ enters
as the inverse$^1$ of covariant $g_{kj}$. We use this contravariant $g^{ik}$ to raise indices,
converting a covariant index into a contravariant index, as shown subsequently.
Likewise the covariant $g_{kj}$ will be used to lower indices. The choice of $g^{ik}$ and
$g_{kj}$ for this raising-lowering operation is arbitrary. Any second-rank tensor
(and its inverse) would do. Specifically, we have

$$g^{ij}\boldsymbol{\varepsilon}_j = \boldsymbol{\varepsilon}^i \qquad \text{relating covariant and contravariant basis vectors,}$$

$$g^{ij}F_j = F^i \qquad \text{relating covariant and contravariant vector components.} \tag{3.133}$$

Then

$$g_{ij}\boldsymbol{\varepsilon}^j = \boldsymbol{\varepsilon}_i,$$

$$g_{ij}F^j = F_i \tag{3.134}$$

as the corresponding index lowering relations.

1 If the tensor $g_{kj}$ is written as a matrix, the tensor $g^{ik}$ is given by the inverse
matrix.

As an example of these transformations we start with the contravariant form
of vector

$$\mathbf{F} = F^i\boldsymbol{\varepsilon}_i. \tag{3.135}$$

From Eqs. 3.133 and 3.134

$$\mathbf{F} = F_j\,g^{ji}g_{ik}\,\boldsymbol{\varepsilon}^k = F_j\boldsymbol{\varepsilon}^j, \tag{3.136}$$

the final equality coming from Eq. 3.132. Equation 3.135 gives the contravariant
representation of $\mathbf{F}$. Equation 3.136 gives the corresponding covariant
representation of the same $\mathbf{F}$. Examples of such representation appear in
Section 4.4, "Oblique Coordinates."

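It should be emphasized again that the $\boldsymbol{\varepsilon}_i$ and $\boldsymbol{\varepsilon}^j$ do not have unit magnitude.
This may be seen in Eqs. 3.129 and in the metric tensor $g_{ij}$ for spherical polar
coordinates and its inverse $g^{ij}$:

$$(g_{ij}) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & r^2 & 0 \\ 0 & 0 & r^2\sin^2\theta \end{pmatrix}, \qquad (g^{ij}) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \dfrac{1}{r^2} & 0 \\ 0 & 0 & \dfrac{1}{r^2\sin^2\theta} \end{pmatrix}.$$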
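
The spherical polar metric can be recovered directly from the definitions $\boldsymbol{\varepsilon}_i = \partial\mathbf{r}/\partial q^i$ (Eq. 3.127) and $g_{ij} = \boldsymbol{\varepsilon}_i\cdot\boldsymbol{\varepsilon}_j$ (Eq. 3.131). The following sketch does this with central differences at one illustrative point; the step size `h`, the sample point `q`, and the function names are assumptions made for the example.

```python
import math

def position(q):
    """Cartesian position vector for spherical polar coordinates (r, theta, phi)."""
    r, theta, phi = q
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def basis_vectors(q, h=1e-6):
    """eps_i = d r / d q^i (Eq. 3.127), estimated by central differences."""
    basis = []
    for i in range(3):
        qp, qm = list(q), list(q)
        qp[i] += h
        qm[i] -= h
        rp, rm = position(qp), position(qm)
        basis.append(tuple((p - m) / (2 * h) for p, m in zip(rp, rm)))
    return basis

def metric(q):
    """g_ij = eps_i . eps_j (Eq. 3.131)."""
    eps = basis_vectors(q)
    return [[sum(x * y for x, y in zip(eps[i], eps[j])) for j in range(3)]
            for i in range(3)]

q = (2.0, 0.7, 1.1)                      # illustrative point (r, theta, phi)
g = metric(q)
expected_diag = [1.0, q[0] ** 2, (q[0] * math.sin(q[1])) ** 2]
print([g[i][i] for i in range(3)], expected_diag)
```

The diagonal elements are the squared lengths of the $\boldsymbol{\varepsilon}_i$, which makes the non-unit magnitude of these basis vectors explicit.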





Derivatives, Christoffel Symbols 
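Let us form the differential of a scalar

$$d\psi = \frac{\partial\psi}{\partial q^i}\,dq^i. \tag{3.137}$$

Since the $dq^i$ are the components of a contravariant vector, the partial derivatives
$\partial\psi/\partial q^i$ must form a covariant vector, by the quotient rule. The gradient
of a scalar becomes

$$\nabla\psi = \frac{\partial\psi}{\partial q^i}\,\boldsymbol{\varepsilon}^i. \tag{3.138}$$

The reader should note that $\partial\psi/\partial q^i$ are not the gradient components of Section
2.2, because $\boldsymbol{\varepsilon}^i \neq \mathbf{e}_i$ of Section 2.2.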

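Moving on to the derivatives of a vector, we find that the situation is much
more complicated because the basis vectors $\boldsymbol{\varepsilon}_i$ are in general not constant.
Remember we are no longer restricting ourselves to cartesian coordinates and
the nice, convenient $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$! Direct differentiation yields

$$\frac{\partial\mathbf{V}}{\partial q^j} = \frac{\partial V^i}{\partial q^j}\,\boldsymbol{\varepsilon}_i + V^i\,\frac{\partial\boldsymbol{\varepsilon}_i}{\partial q^j}. \tag{3.139}$$

Now $\partial\boldsymbol{\varepsilon}_i/\partial q^j$ will be some linear combination of the $\boldsymbol{\varepsilon}_k$ with the coefficient
depending on the indices $i$ and $j$ from the partial derivative and index $k$ from
the base vector. We write

$$\frac{\partial\boldsymbol{\varepsilon}_i}{\partial q^j} = \Gamma^k_{ij}\,\boldsymbol{\varepsilon}_k. \tag{3.140a}$$

Multiplying by $\boldsymbol{\varepsilon}^m$, we have

$$\Gamma^m_{ij} = \boldsymbol{\varepsilon}^m\cdot\frac{\partial\boldsymbol{\varepsilon}_i}{\partial q^j}. \tag{3.140b}$$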


The $\Gamma^k_{ij}$ is a Christoffel symbol (of the second kind). It is also called a "coefficient
of connection." These $\Gamma^k_{ij}$ are not third-rank tensors and the $\partial V^i/\partial q^j$ of Eq.
3.139 are not second-rank tensors. Equation 3.140 should be compared with
the results quoted in Exercise 2.2.3 (remembering that in general $\boldsymbol{\varepsilon}_i \neq \mathbf{e}_i$). In
cartesian coordinates, $\Gamma^k_{ij} = 0$ for all values of the indices $i$, $j$, and $k$. These
Christoffel three index symbols may be computed by the techniques of Chapter
2. This is the topic of Exercise 3.8.7. Equation 3.153 at the end of this section
offers an easier method.

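Using Eq. 3.127, we obtain

$$\frac{\partial\boldsymbol{\varepsilon}_i}{\partial q^j} = \frac{\partial^2\mathbf{r}}{\partial q^j\,\partial q^i} = \frac{\partial\boldsymbol{\varepsilon}_j}{\partial q^i} = \Gamma^k_{ji}\,\boldsymbol{\varepsilon}_k. \tag{3.141}$$

Hence these Christoffel symbols are symmetric in the two lower indices:

$$\Gamma^k_{ij} = \Gamma^k_{ji}. \tag{3.142}$$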


Covariant Derivative 

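With the Christoffel symbols, Eq. 3.139 may be rewritten

$$\frac{\partial\mathbf{V}}{\partial q^j} = \frac{\partial V^i}{\partial q^j}\,\boldsymbol{\varepsilon}_i + V^i\,\Gamma^k_{ij}\,\boldsymbol{\varepsilon}_k. \tag{3.143}$$

Now $i$ and $k$ in the last term are dummy indices. Interchanging $i$ and $k$ (in this
one term), we have

$$\frac{\partial\mathbf{V}}{\partial q^j} = \left(\frac{\partial V^i}{\partial q^j} + V^k\,\Gamma^i_{kj}\right)\boldsymbol{\varepsilon}_i. \tag{3.144}$$

The quantity in parentheses is labeled a covariant derivative, $V^i_{;j}$. We have

$$V^i_{;j} \equiv \frac{\partial V^i}{\partial q^j} + V^k\,\Gamma^i_{kj}. \tag{3.145}$$

The $;j$ subscript indicates differentiation with respect to $q^j$. The differential
$d\mathbf{V}$ becomes

$$d\mathbf{V} = \frac{\partial\mathbf{V}}{\partial q^j}\,dq^j = [V^i_{;j}\,dq^j]\,\boldsymbol{\varepsilon}_i. \tag{3.146}$$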

A comparison with Eq. 3.126 or 3.135 shows that the quantity in square
brackets is the $i$th contravariant component of a vector. Since $dq^j$ is the $j$th
contravariant component of a vector (again, Eq. 3.126), $V^i_{;j}$ must be the $ij$th
component of a (mixed) second-rank tensor (quotient rule). The covariant
derivatives of the contravariant components of a vector form a mixed second-rank
tensor, $V^i_{;j}$.

Since the Christoffel symbols vanish in cartesian coordinates, the covariant
derivative and the ordinary partial derivative coincide

$$\frac{\partial V^i}{\partial q^j} = V^i_{;j} \qquad \text{(cartesian coordinates)}. \tag{3.147}$$

The covariant derivative of a covariant vector $V_i$ is given by (Exercise 3.8.8)

$$V_{i;j} = \frac{\partial V_i}{\partial q^j} - V_k\,\Gamma^k_{ij}. \tag{3.148}$$

Like $V^i_{;j}$, $V_{i;j}$ is a second-rank tensor.

The physical importance of the covariant derivative is that 


A consistent replacement of regular partial derivatives by covariant derivatives 
carries the laws of physics (in component form) from flat space time into the curved 
(Riemannian) space time of general relativity. Indeed, this substitution may be taken 
as a mathematical statement of Einstein’s principle of equivalence . 2 


The Christoffel Symbols as Derivatives of the 
Metric Tensor 

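It is often convenient to have an explicit expression for the Christoffel
symbols in terms of derivatives of the metric tensor. As an initial step, we define
the Christoffel symbol of the first kind $[ij, k]$ by

$$[ij, k] \equiv g_{mk}\,\Gamma^m_{ij}. \tag{3.149}$$

This $[ij, k]$ is not a third-rank tensor. From Eq. 3.140b

$$[ij, k] = g_{mk}\,\boldsymbol{\varepsilon}^m\cdot\frac{\partial\boldsymbol{\varepsilon}_i}{\partial q^j} = \boldsymbol{\varepsilon}_k\cdot\frac{\partial\boldsymbol{\varepsilon}_i}{\partial q^j}. \tag{3.150}$$

Now we differentiate $g_{ij} = \boldsymbol{\varepsilon}_i\cdot\boldsymbol{\varepsilon}_j$, Eq. 3.131:

$$\frac{\partial g_{ij}}{\partial q^k} = \frac{\partial\boldsymbol{\varepsilon}_i}{\partial q^k}\cdot\boldsymbol{\varepsilon}_j + \boldsymbol{\varepsilon}_i\cdot\frac{\partial\boldsymbol{\varepsilon}_j}{\partial q^k} = [ik, j] + [jk, i] \tag{3.151}$$

by Eq. 3.150.
Then

$$[ij, k] = \frac{1}{2}\left\{\frac{\partial g_{ik}}{\partial q^j} + \frac{\partial g_{jk}}{\partial q^i} - \frac{\partial g_{ij}}{\partial q^k}\right\} \tag{3.152}$$

and

$$\Gamma^s_{ij} = g^{ks}[ij, k] = \frac{1}{2}\,g^{ks}\left\{\frac{\partial g_{ik}}{\partial q^j} + \frac{\partial g_{jk}}{\partial q^i} - \frac{\partial g_{ij}}{\partial q^k}\right\}. \tag{3.153}$$


2 Misner, C. W., K. S. Thorne, and J. A. Wheeler, Gravitation. San Francisco:
W. H. Freeman (1973), p. 387.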

These Christoffel symbols and the covariant derivatives are applied in the next 
section. 
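
As an illustration of Eq. 3.153, the sketch below evaluates a few Christoffel symbols for spherical polar coordinates from the metric alone, using central differences for the metric derivatives. It assumes a diagonal metric, so that $g^{ks} = \delta_{ks}/g_{ss}$ and only $k = s$ contributes to the sum; the function names, the step size, and the sample point are illustrative assumptions. The values compared against are the standard spherical polar results.

```python
import math

def metric_diag(q):
    """Diagonal covariant metric (g_rr, g_thth, g_phph) for (r, theta, phi)."""
    r, theta, _ = q
    return [1.0, r ** 2, (r * math.sin(theta)) ** 2]

def dg(q, i, j, k, h=1e-6):
    """Central-difference estimate of d g_ij / d q^k (zero off the diagonal)."""
    if i != j:
        return 0.0
    qp, qm = list(q), list(q)
    qp[k] += h
    qm[k] -= h
    return (metric_diag(qp)[i] - metric_diag(qm)[i]) / (2 * h)

def christoffel(q, s, i, j):
    """Eq. 3.153 for a diagonal metric: only k = s survives the sum over k."""
    g_ss = metric_diag(q)[s]
    return (dg(q, i, s, j) + dg(q, j, s, i) - dg(q, i, j, s)) / (2 * g_ss)

R, TH, PH = 0, 1, 2
q = (2.0, 0.7, 1.1)                       # illustrative point
r, theta = q[0], q[1]

gamma_r_thth = christoffel(q, R, TH, TH)    # standard result: -r
gamma_r_phph = christoffel(q, R, PH, PH)    # standard result: -r sin^2(theta)
gamma_th_rth = christoffel(q, TH, R, TH)    # standard result: 1/r
gamma_ph_thph = christoffel(q, PH, TH, PH)  # standard result: cot(theta)
print(gamma_r_thth, gamma_r_phph, gamma_th_rth, gamma_ph_thph)
```

For a nondiagonal metric, such as the oblique system of Exercise 3.8.12, the inverse matrix $g^{ks}$ would have to be computed in full.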


EXERCISES 


3.8.1 Equations 3.128 and 3.129 use the scale factor $h_i$, citing Exercise 2.2.3. In Section
2.2 we had restricted ourselves to orthogonal coordinate systems, yet Eq. 3.128
holds for nonorthogonal systems. Justify the use of Eq. 3.128 for nonorthogonal
systems.

3.8.2 (a) Show that $\boldsymbol{\varepsilon}^i\cdot\boldsymbol{\varepsilon}_j = \delta^i_j$.

(b) From the result of part (a) show that

$F^i = \mathbf{F}\cdot\boldsymbol{\varepsilon}^i$ and $F_i = \mathbf{F}\cdot\boldsymbol{\varepsilon}_i$.

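3.8.3 For the special case of three-dimensional space ($\boldsymbol{\varepsilon}_1$, $\boldsymbol{\varepsilon}_2$, $\boldsymbol{\varepsilon}_3$ defining a right-handed
coordinate system, not necessarily orthogonal) show that

$$\boldsymbol{\varepsilon}^i = \frac{\boldsymbol{\varepsilon}_j\times\boldsymbol{\varepsilon}_k}{\boldsymbol{\varepsilon}_j\times\boldsymbol{\varepsilon}_k\cdot\boldsymbol{\varepsilon}_i}, \qquad i, j, k = 1, 2, 3 \text{ and cyclic permutations.}$$

Note. These contravariant basis vectors, $\boldsymbol{\varepsilon}^i$, define the reciprocal lattice space
of Sections 1.5 and 4.4.

3.8.4 Prove that the contravariant metric tensor is given by

$$g^{ij} = \boldsymbol{\varepsilon}^i\cdot\boldsymbol{\varepsilon}^j.$$

3.8.4A If the covariant vectors $\boldsymbol{\varepsilon}_i$ are orthogonal, show that

(a) $g_{ij}$ is diagonal,

(b) $g^{ii} = 1/g_{ii}$ (no summation),

(c) $|\boldsymbol{\varepsilon}^i| = 1/|\boldsymbol{\varepsilon}_i|$.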

3.8.5 Derive the covariant and contravariant metric tensors for circular cylindrical 
coordinates. 


3.8.6 Transform the right-hand side of Eq. 3.138

$$\nabla\psi = \frac{\partial\psi}{\partial q^i}\,\boldsymbol{\varepsilon}^i$$

into the $\mathbf{e}_i$ basis and verify that this expression agrees with the gradient developed
in Section 2.2 (for orthogonal coordinates).


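3.8.7 Evaluate $\partial\boldsymbol{\varepsilon}_i/\partial q^j$ for the spherical polar coordinates, and from these results
calculate the spherical polar coordinate $\Gamma^k_{ij}$.

Note. Exercise 2.5.1 offers a way of calculating the needed partial derivatives.
Remember

$\boldsymbol{\varepsilon}_1 = \mathbf{r}_0$ but $\boldsymbol{\varepsilon}_2 = r\,\boldsymbol{\theta}_0$ and $\boldsymbol{\varepsilon}_3 = r\sin\theta\,\boldsymbol{\varphi}_0$.

3.8.8 Show that the covariant derivative of a covariant vector is given by

$$V_{i;j} \equiv \frac{\partial V_i}{\partial q^j} - V_k\,\Gamma^k_{ij}.$$

Hint. Differentiate

$$\boldsymbol{\varepsilon}^i\cdot\boldsymbol{\varepsilon}_j = \delta^i_j.$$

3.8.9 Verify that $V_{i;j} = g_{ik}V^k_{;j}$ by showing that

$$\frac{\partial V_i}{\partial q^j} - V_k\,\Gamma^k_{ij} = g_{ik}\left[\frac{\partial V^k}{\partial q^j} + V^m\,\Gamma^k_{mj}\right].$$

3.8.10 From the circular cylindrical metric tensor, $g_{ij}$, calculate the $\Gamma^k_{ij}$ for circular
cylindrical coordinates.

Note. There are only three nonvanishing $\Gamma$'s.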

3.8.11 Using the $\Gamma^k_{ij}$ from Exercise 3.8.10, write out the covariant derivatives $V^i_{;j}$ of
a vector $\mathbf{V}$ in circular cylindrical coordinates.


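3.8.12 A triclinic crystal is described, using an oblique coordinate system. The three
covariant base vectors are

$$\boldsymbol{\varepsilon}_1 = 1.5\,\mathbf{i},$$

$$\boldsymbol{\varepsilon}_2 = 0.4\,\mathbf{i} + 1.6\,\mathbf{j},$$

and

$$\boldsymbol{\varepsilon}_3 = 0.2\,\mathbf{i} + 0.3\,\mathbf{j} + 1.0\,\mathbf{k}.$$

(a) Calculate the elements of the covariant metric tensor $g_{ij}$.

(b) Calculate the Christoffel three index symbols, $\Gamma^k_{ij}$. (This is a "by inspection"
calculation.)

(c) From the cross-product form of Exercise 3.8.3 calculate the contravariant
base vector $\boldsymbol{\varepsilon}^3$.

(d) Using the explicit forms of $\boldsymbol{\varepsilon}^3$ and $\boldsymbol{\varepsilon}_i$, verify that $\boldsymbol{\varepsilon}^3\cdot\boldsymbol{\varepsilon}_i = \delta^3_i$.

Note. If it were needed, the contravariant metric tensor could be determined
by finding the inverse of $g_{ij}$ or by finding the $\boldsymbol{\varepsilon}^i$ and using $g^{ij} = \boldsymbol{\varepsilon}^i\cdot\boldsymbol{\varepsilon}^j$.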


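3.8.13 Verify that

$$[ij, k] = \frac{1}{2}\left\{\frac{\partial g_{ik}}{\partial q^j} + \frac{\partial g_{jk}}{\partial q^i} - \frac{\partial g_{ij}}{\partial q^k}\right\}.$$

Hint. Substitute Eq. 3.151 into the right-hand side and show that an identity
results.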


3.9 TENSOR DIFFERENTIAL OPERATIONS 

In this section the covariant derivative of Section 3.8 is applied to rederive 
the vector differential operations of Section 2.2 in general tensor form. 

Divergence 

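Replacing the partial derivative by the covariant derivative, we take the
divergence to be

$$\nabla\cdot\mathbf{V} = V^i_{;i} = \frac{\partial V^i}{\partial q^i} + V^k\,\Gamma^i_{ik}. \tag{3.154}$$

Expressing $\Gamma^i_{ik}$ by Eq. 3.153, we have

$$\Gamma^i_{ik} = \frac{1}{2}\,g^{im}\left\{\frac{\partial g_{im}}{\partial q^k} + \frac{\partial g_{km}}{\partial q^i} - \frac{\partial g_{ik}}{\partial q^m}\right\}. \tag{3.155}$$

When contracted with $g^{im}$ the last two terms in the curly bracket cancel, since

$$g^{im}\,\frac{\partial g_{km}}{\partial q^i} = g^{mi}\,\frac{\partial g_{ki}}{\partial q^m} = g^{im}\,\frac{\partial g_{ik}}{\partial q^m}; \tag{3.156}$$

then

$$\Gamma^i_{ik} = \frac{1}{2}\,g^{im}\,\frac{\partial g_{im}}{\partial q^k}. \tag{3.157}$$

From the theory of determinants, Section 4.1,

$$\frac{\partial g}{\partial q^k} = g\,g^{im}\,\frac{\partial g_{im}}{\partial q^k}, \tag{3.158}$$

where $g$ is the determinant of the metric, $g = \det(g_{ij})$. Substituting this result
into Eq. 3.157, we obtain

$$\Gamma^i_{ik} = \frac{1}{2g}\,\frac{\partial g}{\partial q^k} = \frac{1}{g^{1/2}}\,\frac{\partial g^{1/2}}{\partial q^k}. \tag{3.159}$$

This yields

$$\nabla\cdot\mathbf{V} = V^i_{;i} = \frac{1}{g^{1/2}}\,\frac{\partial}{\partial q^k}\left(g^{1/2}\,V^k\right). \tag{3.160}$$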


To compare this result with Eq. 2.17, note that $h_1h_2h_3 = g^{1/2}$ and $V^i$ (contravariant
coefficient of $\boldsymbol{\varepsilon}_i$) $= V_i/h_i$ (no summation), where $V_i$ is the Section 2.2
coefficient of $\mathbf{e}_i$.
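
Equation 3.160 can be tried out on a field whose divergence is known. The sketch below uses spherical polar coordinates, where $g^{1/2} = r^2\sin\theta$, and the cartesian field $\mathbf{V} = z\,\hat{\mathbf{z}}$, whose divergence is 1. Its contravariant components, $V^r = r\cos^2\theta$, $V^\theta = -\sin\theta\cos\theta$, $V^\varphi = 0$, follow from $V^i = V_i/h_i$. The choice of test field, the finite-difference step, and the function names are assumptions made for this example.

```python
import math

def sqrt_g(q):
    """g^(1/2) = h_1 h_2 h_3 = r^2 sin(theta) for spherical polar coordinates."""
    r, theta, _ = q
    return r ** 2 * math.sin(theta)

def V_contra(q):
    """Contravariant components of V = z z_hat in (r, theta, phi)."""
    r, theta, _ = q
    return (r * math.cos(theta) ** 2, -math.sin(theta) * math.cos(theta), 0.0)

def divergence(V, q, h=1e-5):
    """Eq. 3.160: (1/sqrt(g)) d(sqrt(g) V^k)/dq^k, by central differences."""
    total = 0.0
    for k in range(3):
        qp, qm = list(q), list(q)
        qp[k] += h
        qm[k] -= h
        total += (sqrt_g(qp) * V(qp)[k] - sqrt_g(qm) * V(qm)[k]) / (2 * h)
    return total / sqrt_g(q)

q = (2.0, 0.7, 1.1)          # illustrative point (r, theta, phi)
div_V = divergence(V_contra, q)
print(div_V)
```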


Laplacian 

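In Section 2.2 replacement of the vector $\mathbf{V}$ in $\nabla\cdot\mathbf{V}$ by $\nabla\psi$ led to the Laplacian
$\nabla\cdot\nabla\psi$. Here we have a contravariant $V^i$. Using the metric tensor to create a
contravariant $\nabla\psi$, we make the substitution

$$V^i \to g^{ik}\,\frac{\partial\psi}{\partial q^k}.$$

Then the Laplacian $\nabla\cdot\nabla\psi$ becomes

$$\nabla\cdot\nabla\psi = \frac{1}{g^{1/2}}\,\frac{\partial}{\partial q^i}\left(g^{1/2}\,g^{ik}\,\frac{\partial\psi}{\partial q^k}\right). \tag{3.161}$$

For the orthogonal systems of Section 2.2 the metric tensor is diagonal and the
contravariant $g^{ii}$ becomes

$$g^{ii} = (h_i)^{-2}.$$

Equation 3.161 reduces to

$$\nabla\cdot\nabla\psi = \frac{1}{h_1h_2h_3}\,\frac{\partial}{\partial q^i}\left(\frac{h_1h_2h_3}{h_i^2}\,\frac{\partial\psi}{\partial q^i}\right),$$

in agreement with Eq. 2.18a.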
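
The orthogonal-coordinate form of Eq. 3.161 can be exercised in the same way. The sketch below applies it in spherical polar coordinates to $\psi = z^2 = r^2\cos^2\theta$, for which the cartesian Laplacian is 2. Nested central differences are used, so the result is only approximate; the test function, step size, and helper names are illustrative assumptions.

```python
import math

def scale_factors(q):
    """(h_r, h_theta, h_phi) = (1, r, r sin(theta))."""
    r, theta, _ = q
    return (1.0, r, r * math.sin(theta))

def psi(q):
    """Test scalar psi = z^2 = (r cos(theta))^2, with cartesian Laplacian 2."""
    r, theta, _ = q
    return (r * math.cos(theta)) ** 2

def shifted(q, i, dq):
    q2 = list(q)
    q2[i] += dq
    return tuple(q2)

def flux(q, i, h):
    """g^(1/2) g^{ii} dpsi/dq^i (no summation), with g^{ii} = 1/h_i^2."""
    hs = scale_factors(q)
    root_g = hs[0] * hs[1] * hs[2]
    dpsi = (psi(shifted(q, i, h)) - psi(shifted(q, i, -h))) / (2 * h)
    return root_g * dpsi / hs[i] ** 2

def laplacian(q, h=1e-4):
    """Eq. 3.161 for a diagonal metric, by nested central differences."""
    hs = scale_factors(q)
    root_g = hs[0] * hs[1] * hs[2]
    total = sum((flux(shifted(q, i, h), i, h) - flux(shifted(q, i, -h), i, h)) / (2 * h)
                for i in range(3))
    return total / root_g

q = (2.0, 0.7, 1.1)          # illustrative point (r, theta, phi)
lap = laplacian(q)
print(lap)
```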


Curl 

The difference of derivatives that appears in the curl (Eq. 2.21) will be
written

$$\frac{\partial V_i}{\partial q^j} - \frac{\partial V_j}{\partial q^i}.$$

Again, remember that the components $V_i$ here are coefficients of the contravariant
(nonunit) base vectors $\boldsymbol{\varepsilon}^i$. The $V_i$ of Section 2.2 are coefficients of unit
vectors $\mathbf{e}_i$. Adding and subtracting, we obtain

$$\frac{\partial V_i}{\partial q^j} - \frac{\partial V_j}{\partial q^i} = \frac{\partial V_i}{\partial q^j} - V_k\,\Gamma^k_{ij} - \frac{\partial V_j}{\partial q^i} + V_k\,\Gamma^k_{ji} = V_{i;j} - V_{j;i}, \tag{3.162}$$

using the symmetry of the Christoffel symbols, Eq. 3.142.
The characteristic difference of derivatives of the curl becomes a difference of
covariant derivatives and therefore is a second-rank tensor (covariant in both
indices). As emphasized in Section 3.4, the special vector form of the curl exists
only in three-dimensional space.

From Eq. 3.153 it is clear that all the Christoffel three index symbols vanish
in Minkowski space ($g_{\lambda\mu} = \delta_{\lambda\mu}$) and in the real space-time of special relativity
with its constant diagonal metric, whose nonzero elements are $\pm 1$.

This completes the development of the differential operators in general 
tensor form. (The gradient was given in Section 3.8.) In addition to the fields of 
elasticity and electromagnetism, these differential forms find application in 
mechanics (Lagrangian mechanics, Hamiltonian mechanics, and the Euler equations
for rotation of a rigid body); fluid mechanics; and perhaps most important
of all, in the curved space-time of modern theories of gravity.

EXERCISES


3.9.1 Verify Eq. 3.158

$$\frac{\partial g}{\partial q^k} = g g^{im} \frac{\partial g_{im}}{\partial q^k}$$

for the specific case of spherical polar coordinates.


3.9.2 Starting with the divergence in tensor notation, Eq. 3.160, develop the divergence
of a vector in spherical polar coordinates, Eq. 2.45.


3.9.3 The covariant vector $A_i$ is the gradient of a scalar. Show that the difference of
covariant derivatives $A_{i;j} - A_{j;i}$ vanishes.


4 DETERMINANTS,
MATRICES, AND 
GROUP THEORY 

“ Disciplined judgement about what is neat 
and symmetrical and elegant has time and 
time again proved an excellent guide to 
how nature works” 

Murray Gell-Mann 


4.1 DETERMINANTS

We begin our study of matrices by summarizing some properties of deter- 
minants, partly because determinants are useful in matrix analysis and partly 
to illustrate, by way of contrast, what matrices are not. The concept of “deter- 
minant” and the notation were introduced by Leibnitz. 

Properties 

A determinant is (1) a square array of numbers or functions that (2) may be 
combined together according to the rule that follows. We have 



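$$D = \begin{vmatrix} a_1 & b_1 & c_1 & \cdots \\ a_2 & b_2 & c_2 & \cdots \\ a_3 & b_3 & c_3 & \cdots \\ \vdots & \vdots & \vdots & \\ a_n & b_n & c_n & \cdots \end{vmatrix}. \tag{4.1}$$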



The number of columns (and of rows) in the array is sometimes called the
order of the determinant. In terms of its elements, $a_i$, $b_j$, and so on, the value of
the determinant $D$ is

$$D = \sum_{i,j,k,\ldots} \varepsilon_{ijk\cdots}\, a_i b_j c_k \cdots, \tag{4.2}$$

where $\varepsilon_{ijk\cdots}$, analogous to the Levi-Civita symbol of Section 3.4, is +1 for even
permutations¹ of (1, 2, 3, ..., n), −1 for odd permutations, and zero if any
index is repeated.

1 In a linear array abcd..., any single, simple transposition of adjacent
elements yields an odd permutation of the original array: abcd → bacd. Two
such transpositions yield an even permutation. In general, an odd number of
such interchanges of adjacent elements results in an odd permutation; an even
number of such transpositions yields an even permutation.


Specifically, for the third-order determinant,

$$D = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}, \tag{4.3}$$

Equation 4.2 leads to

$$D = + a_1 b_2 c_3 - a_1 b_3 c_2 + a_2 b_3 c_1 - a_2 b_1 c_3 + a_3 b_1 c_2 - a_3 b_2 c_1, \tag{4.4}$$

with six terms in the sum.

The third-order determinant, then, is this particular linear combination of 
products. Each product contains one and only one element from each row and 
from each column. Each product is added in if the order represents an even 
permutation of rows (the columns being in a, b, c or 1, 2, 3 order) and
subtracted if we have an odd permutation. Equation 4.3 may be considered
shorthand notation for Eq. 4.4. The number of terms in the sum (Eq. 4.2) is 24
for a fourth-order determinant, n! for an nth-order determinant. Because of
the appearance of the negative signs in Eq. 4.4 (and possibly in the individual
elements, $a_i$, $b_j$, ..., as well), there may be considerable cancellation. It is
quite possible that a determinant of large numbers will have a very small value.

Several useful properties of the nth-order determinants follow from Eq. 4.2. 
Again, to be specific, Eq. 4.4 for third-order determinants is used to illustrate 
these properties. 
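
As a rough illustration of Eq. 4.2, the following Python sketch sums over all n!
permutations of the row indices. The function names `parity_sign` and
`det_by_permutations` are assumptions of this sketch, not part of the text, and
zero-based indices replace the 1, 2, ..., n of Eq. 4.2. The determinant is passed
as a list of columns (the a's, the b's, the c's, ...).

```python
from itertools import permutations


def parity_sign(perm):
    """Return +1 for an even permutation of 0..n-1 and -1 for an odd one."""
    sign = 1
    seen = [False] * len(perm)
    for start in range(len(perm)):
        if seen[start]:
            continue
        # Walk one cycle; a cycle of length L is equivalent to L - 1 transpositions.
        length = 0
        j = start
        while not seen[j]:
            seen[j] = True
            j = perm[j]
            length += 1
        if length % 2 == 0:
            sign = -sign
    return sign


def det_by_permutations(columns):
    """Evaluate Eq. 4.2: columns[0] holds the a_i, columns[1] the b_j, and so on."""
    n = len(columns)
    total = 0
    for perm in permutations(range(n)):
        term = parity_sign(perm)          # the Levi-Civita factor
        for col, row in enumerate(perm):
            term *= columns[col][row]     # a_i * b_j * c_k * ...
        total += term
    return total
```

For a third-order determinant the loop visits exactly the six products of Eq. 4.4.
The cost grows as n!, so this form is useful only for very small n.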


Laplacian Development by Minors 

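Equation 4.4 may be written

$$\begin{aligned} D &= a_1 (b_2 c_3 - b_3 c_2) - a_2 (b_1 c_3 - b_3 c_1) + a_3 (b_1 c_2 - b_2 c_1) \\ &= a_1 \begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix} - a_2 \begin{vmatrix} b_1 & c_1 \\ b_3 & c_3 \end{vmatrix} + a_3 \begin{vmatrix} b_1 & c_1 \\ b_2 & c_2 \end{vmatrix}. \end{aligned} \tag{4.5}$$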

In general, the nth-order determinant may be expanded as a linear combination
of the products of the elements of any row (or any column) and the (n − 1)-
order determinants formed by striking out the row and column of the original
determinant in which the element appears. This reduced array (2 × 2 in this
specific example) is called a “minor.” If the element is in the ith row and the
jth column, the sign associated with the product is $(-1)^{i+j}$. The minor with
the sign $(-1)^{i+j}$ is called the “cofactor.” If $M_{ij}$ is used to designate the minor
formed by omitting the ith row and the jth column and $C_{ij}$ is the corresponding
cofactor, Eq. 4.5 becomes

$$D = \sum_{i=1}^{3} (-1)^{i+1} a_i M_{i1} = \sum_{i=1}^{3} a_i C_{i1}.$$

In this case, expanding down the first column, we have j = 1 and the summation
over i.

This Laplace expansion may be used to advantage in the evaluation of
high-order determinants in which many of the elements are zero. For example,
to find the value of the determinant

$$D = \begin{vmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{vmatrix}, \tag{4.6}$$

we expand across the top row to obtain

$$D = (-1)^{1+2} \cdot (1) \begin{vmatrix} -1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{vmatrix}. \tag{4.7}$$

Again, expanding across the top row, we get

$$D = (-1) \cdot (-1)^{1+1} \cdot (-1) \begin{vmatrix} 0 & 1 \\ -1 & 0 \end{vmatrix} = \begin{vmatrix} 0 & 1 \\ -1 & 0 \end{vmatrix} = 1. \tag{4.8}$$


This determinant D (Eq. 4.6) is formed from one of the Dirac matrices appearing 
in Dirac’s relativistic electron theory. 
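
The expansion by minors can be sketched recursively in Python. This is a minimal
sketch, assuming the determinant is stored as a list of rows; the names `minor`
and `det_laplace` are hypothetical. Zero elements of the top row are skipped,
which is where the saving for sparse determinants such as Eq. 4.6 comes from.

```python
def minor(matrix, i, j):
    """Array left after striking out row i and column j (zero-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(matrix) if k != i]


def det_laplace(matrix):
    """Laplace development across the top row."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    total = 0
    for j, element in enumerate(matrix[0]):
        if element == 0:
            continue  # a zero element contributes nothing, so skip its minor
        # Cofactor sign (-1)**(i + j) with i = 0 for the top row.
        total += (-1) ** j * element * det_laplace(minor(matrix, 0, j))
    return total


# The determinant of Eq. 4.6.
D = [[0, 1, 0, 0],
     [-1, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 0, -1, 0]]
```

Calling `det_laplace(D)` follows the same two expansions as Eqs. 4.7 and 4.8.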


Antisymmetry 

The determinant changes sign if any two rows are interchanged or if any 
two columns are interchanged. This follows from the even-odd character of 
the Levi-Civita ε in Eq. 4.2 or explicitly from the form of Eqs. 4.3 and 4.4.²

This property was used in Section 3.4 to develop a totally antisymmetric 
linear combination. It is also frequently used in quantum mechanics in the 
construction of a many particle wave function that, in accordance with the 
Pauli exclusion principle, will be antisymmetric under the interchange of any 
two identical spin 1/2 particles (electrons, protons, neutrons, etc.).

As a special case of antisymmetry, any determinant with two rows equal or 
two columns equal equals zero. 

If each element in a row or each element in a column is zero, the determinant 
is equal to zero. 

If each element in a row or each element in a column is multiplied by a 
constant, the determinant is multiplied by that constant. 

The value of a determinant is unchanged if a multiple of one row is added 
(column by column) to another row or if a multiple of one column is added 
(row by row) to another column. 

2 The sign reversal is reasonably obvious for the interchange of two adjacent
rows (or columns), this clearly being an odd permutation. The reader may
wish to show that the interchange of any two rows is still an odd permutation.

We have

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 + k b_1 & b_1 & c_1 \\ a_2 + k b_2 & b_2 & c_2 \\ a_3 + k b_3 & b_3 & c_3 \end{vmatrix}. \tag{4.9}$$

Using the Laplace development on the right-hand side, we obtain

$$\begin{vmatrix} a_1 + k b_1 & b_1 & c_1 \\ a_2 + k b_2 & b_2 & c_2 \\ a_3 + k b_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} + k \begin{vmatrix} b_1 & b_1 & c_1 \\ b_2 & b_2 & c_2 \\ b_3 & b_3 & c_3 \end{vmatrix}, \tag{4.10}$$


then by the property of antisymmetry the second determinant on the right-hand 
side of Eq. 4.10 vanishes, verifying Eq. 4.9. 

As a special case, a determinant is equal to zero if any two rows are propor- 
tional or any two columns are proportional. 
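
These properties are easy to spot-check numerically. The sketch below writes out
Eq. 4.4 directly; the helper name `det3`, the sample array `m`, and the constant
`k` are assumptions chosen for illustration only.

```python
def det3(m):
    """Third-order determinant written out term by term as in Eq. 4.4."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = m
    return (a1 * b2 * c3 - a1 * b3 * c2 + a2 * b3 * c1
            - a2 * b1 * c3 + a3 * b1 * c2 - a3 * b2 * c1)


m = [[3, 2, 1], [2, 3, 1], [1, 1, 4]]

# Antisymmetry: interchange the first two rows.
swapped = [m[1], m[0], m[2]]

# Eq. 4.9: add k times the b column to the a column.
k = 5
shifted = [[a + k * b, b, c] for a, b, c in m]

# Special case: the second row proportional to the first.
proportional = [m[0], [2 * x for x in m[0]], m[2]]
```

Here `det3(swapped)` is the negative of `det3(m)`, `det3(shifted)` equals
`det3(m)`, and `det3(proportional)` vanishes.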

Some useful relations involving determinants of matrices appear in the 
exercises of Sections 4.2 and 4.5. 


Solution of a Set of Homogeneous Equations 

One of the major applications of determinants is in the establishment of a 
condition for the existence of a nontrivial solution for a set of linear homo- 
geneous algebraic equations. Suppose we have three homogeneous equations 
with three unknowns (or n equations with n unknowns) 

$$\begin{aligned} a_1 x + b_1 y + c_1 z &= 0, \\ a_2 x + b_2 y + c_2 z &= 0, \\ a_3 x + b_3 y + c_3 z &= 0. \end{aligned} \tag{4.11}$$

The problem is to determine whether any solution, apart from the trivial one
x = 0, y = 0, z = 0, exists.

By forming the determinant of the coefficients of Eq. 4.11 and then multiplying
by x, we have

$$x \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 x & b_1 & c_1 \\ a_2 x & b_2 & c_2 \\ a_3 x & b_3 & c_3 \end{vmatrix}.$$

Now, adding to the first column y times the second column and z times the
third column, we get

$$x \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} a_1 x + b_1 y + c_1 z & b_1 & c_1 \\ a_2 x + b_2 y + c_2 z & b_2 & c_2 \\ a_3 x + b_3 y + c_3 z & b_3 & c_3 \end{vmatrix}. \tag{4.12}$$

This step follows from Eq. 4.9, but by Eq. 4.11 each element of the first column
vanishes. Then

$$x \begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} 0 & b_1 & c_1 \\ 0 & b_2 & c_2 \\ 0 & b_3 & c_3 \end{vmatrix} = 0. \tag{4.13}$$


Therefore x (and y and z) must be zero unless the determinant of the coefficients 
vanishes. Conversely, we can show that if the determinant of the coefficients 
vanishes, a nontrivial solution does indeed exist. This is used in Section 8.6 to 
establish the linear dependence or independence of a set of functions. 


Solution of a Set of Nonhomogeneous Equations 
If our linear algebraic equations are nonhomogeneous, that is, if the zeros
on the right-hand side of Eq. 4.11 are replaced by $d_1$, $d_2$, and $d_3$, respectively,
then from Eq. 4.12 we obtain,³ in place of Eq. 4.13,

$$x = \frac{\begin{vmatrix} d_1 & b_1 & c_1 \\ d_2 & b_2 & c_2 \\ d_3 & b_3 & c_3 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}}. \tag{4.14}$$

If the determinant of the coefficients (the denominator) vanishes, the non- 
homogeneous set of equations has no solution — unless the numerators also 
vanish. In this case solutions may exist but they are not unique (see Exercise 
4.1.3 for a specific example). 
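
For a small system the determinant solution can be sketched as follows. Equation
4.14 gives only x; the sketch assumes the analogous ratios for y and z, in which
the d column replaces the b or c column. The names `det3` and `cramer3` are
hypothetical, and exact fractions are used to sidestep round-off.

```python
from fractions import Fraction


def det3(m):
    """Third-order determinant written out as in Eq. 4.4."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = m
    return (a1 * b2 * c3 - a1 * b3 * c2 + a2 * b3 * c1
            - a2 * b1 * c3 + a3 * b1 * c2 - a3 * b2 * c1)


def cramer3(coeffs, rhs):
    """Solve three equations in three unknowns by ratios of determinants."""
    denom = det3(coeffs)
    if denom == 0:
        raise ZeroDivisionError("determinant of the coefficients vanishes")
    solution = []
    for col in range(3):
        # Replace one column of the coefficient array by the right-hand sides.
        replaced = [row[:col] + [d] + row[col + 1:]
                    for row, d in zip(coeffs, rhs)]
        solution.append(Fraction(det3(replaced), denom))
    return solution


# Coefficients and right-hand sides of a sample system (the one solved
# later in this section by Gauss elimination).
x, y, z = cramer3([[3, 2, 1], [2, 3, 1], [1, 1, 4]], [11, 13, 12])
```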

For numerical work, this determinant solution, Eq. 4.14, is exceedingly
unwieldy. The determinant may involve large numbers with alternate signs,
and in the subtraction of two large numbers the relative error may soar to a
point that makes the result worthless. Also, although the determinant method
is illustrated here with 3 equations and 3 unknowns, we might easily have 20
equations with 20 unknowns. From the definition of determinant (Eq. 4.2),
our nth-order determinant will have n! terms. If we were to ask a high-speed
electronic computer to compute these n! terms at the rate of one each
microsecond, the computer would still take 20! microseconds or 77,000 years. There
must be a better way.
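
The arithmetic behind this estimate is quickly checked. The length of the year
used below (365.25 days) is an assumption of this sketch.

```python
import math

terms = math.factorial(20)               # number of terms in a 20th-order determinant
seconds = terms * 1e-6                   # one term each microsecond
years = seconds / (365.25 * 24 * 3600)   # assuming a year of 365.25 days
```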

In fact, there are better ways. One of the best is a straightforward elimination 
process often called Gauss elimination. To illustrate this technique, consider 
the following set of equations. 


EXAMPLE 4.1.4 Gauss Elimination

Solve

$$\begin{aligned} 3x + 2y + z &= 11 \\ 2x + 3y + z &= 13 \\ x + y + 4z &= 12. \end{aligned} \tag{4.15}$$


3 Exercise 1.5.13 gives the vector analog of Eq. 4.14.

For convenience and for the optimum numerical accuracy, the equations
are rearranged so that the largest coefficients run along the main diagonal 
(upper left to lower right). This has already been done in the preceding set. 

The Gauss technique is to use the first equation to eliminate the first un- 
known, x, from the remaining equations. Then the (new) second equation is 
used to eliminate y from the last equation. In general, we work down through 
the set of equations, and then, with one unknown determined, we work back 
up to solve for each of the other unknowns in succession. 

Dividing each row by its initial coefficient, we see that Eqs. 4.15 become

$$\begin{aligned} x + 0.6667y + 0.3333z &= 3.6667 \\ x + 1.5000y + 0.5000z &= 6.5000 \\ x + 1.0000y + 4.0000z &= 12.0000. \end{aligned} \tag{4.16}$$

Now, using the first equation, we eliminate x from the second and third:

$$\begin{aligned} x + 0.6667y + 0.3333z &= 3.6667 \\ 0.8333y + 0.1667z &= 2.8333 \\ 0.3333y + 3.6667z &= 8.3333, \end{aligned} \tag{4.17}$$

and

$$\begin{aligned} x + 0.6667y + 0.3333z &= 3.6667 \\ y + 0.2000z &= 3.4000 \\ y + 11.0000z &= 25.0000. \end{aligned} \tag{4.18}$$

Repeating the technique, we use the new second equation to eliminate y
from the third equation:

$$\begin{aligned} x + 0.6667y + 0.3333z &= 3.6667 \\ y + 0.2000z &= 3.4000 \\ 10.8000z &= 21.6000, \end{aligned} \tag{4.19}$$

or

$$z = 2.0000.$$

Finally, working back up, we get

$$y + 0.2000 \times 2.0000 = 3.4000,$$

or

$$y = 3.0000.$$

Then with z and y determined,

$$x + 0.6667 \times 3.0000 + 0.3333 \times 2.0000 = 3.6667,$$

and

$$x = 1.0000.$$

The technique may not seem so elegant as Eq. 4.14, but it is well adapted to 
modern computing machines and is far faster than the time spent with deter- 
minants. 
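
A minimal Python sketch of the procedure follows. It differs from the hand
calculation above in two respects, both assumptions of this sketch: rows are not
first divided by their leading coefficients, and the largest available coefficient
is moved onto the diagonal at each step (partial pivoting). The function name
`gauss_solve` is hypothetical.

```python
def gauss_solve(coeffs, rhs):
    """Gauss elimination followed by back substitution."""
    n = len(rhs)
    # Work on an augmented copy so the caller's data are left untouched.
    aug = [[float(v) for v in row] + [float(d)] for row, d in zip(coeffs, rhs)]
    for col in range(n):
        # Bring the largest remaining coefficient in this column onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("determinant of the coefficients vanishes")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Eliminate this unknown from every equation below.
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Work back up, solving for one unknown at a time.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


# The system of Eq. 4.15.
solution = gauss_solve([[3, 2, 1], [2, 3, 1], [1, 1, 4]], [11, 13, 12])
```

The operation count grows roughly as n³ rather than n!, which is the point of the
comparison with the determinant solution.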

This Gauss technique may be used to convert a determinant into triangular
form:

$$D = \begin{vmatrix} a_1 & b_1 & c_1 \\ 0 & b_2 & c_2 \\ 0 & 0 & c_3 \end{vmatrix}$$

for a third-order determinant. In this form $D = a_1 b_2 c_3$. For an nth-order
determinant the evaluation of the triangular form requires only n − 1
multiplications compared with the n! required for the general case.
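
The reduction to triangular form can be sketched as below. The sketch relies on
two properties stated earlier: adding a multiple of one row to another leaves
the determinant unchanged, and interchanging two rows reverses its sign. The
function name `det_by_elimination` is hypothetical, and exact fractions are an
assumption made here to avoid round-off.

```python
from fractions import Fraction


def det_by_elimination(matrix):
    """Reduce to triangular form, then multiply the diagonal elements."""
    n = len(matrix)
    m = [[Fraction(v) for v in row] for row in matrix]
    sign = 1
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign  # a row interchange reverses the sign
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    product = Fraction(sign)
    for i in range(n):
        product *= m[i][i]
    return product
```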

A variation of this progressive elimination is known as Gauss-Jordan 
elimination. We start as with the preceding Gauss elimination, but each new 
equation considered is used to eliminate a variable from all the other equations, 
not just those below it. If we had used this Gauss-Jordan elimination, Eq. 4.19 
would become

$$\begin{aligned} x + 0.2000z &= 1.4000 \\ y + 0.2000z &= 3.4000 \\ z &= 2.0000, \end{aligned} \tag{4.20}$$

using the second equation of Eq. 4.18 to eliminate y from both the first and
third equations. Then the third equation of Eq. 4.20 is used to eliminate z
from the first and second, giving

$$\begin{aligned} x &= 1.0000 \\ y &= 3.0000 \\ z &= 2.0000. \end{aligned} \tag{4.21}$$

We return to this Gauss-Jordan technique in Section 4.2 for inverting matrices. 

Another technique suitable for computer use is the Gauss-Seidel iteration 
technique. Each technique has its advantages and disadvantages. The Gauss 
and Gauss-Jordan methods may have accuracy problems for large deter- 
minants. This is also a problem for matrix inversion (Section 4.2). The Gauss- 
Seidel method, as an iterative method, may have convergence problems. The 
IBM Scientific Subroutine Package (SSP) uses Gauss and Gauss-Jordan tech- 
niques. The Gauss-Seidel iterative method and the Gauss and Gauss-Jordan 
elimination methods are discussed in considerable detail by Ralston and Wilf 
and also by Pennington. 4 
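
The text does not spell out the Gauss-Seidel iteration, so the following is only
a minimal sketch of the usual form of the method: each equation is solved for
its diagonal unknown, using the most recent values of the other unknowns, and
the sweep is repeated until the values stop changing. The function name, the
tolerance, and the sweep limit are assumptions of this sketch.

```python
def gauss_seidel(coeffs, rhs, tol=1e-10, max_sweeps=500):
    """Iterate until the largest change in one sweep falls below tol."""
    n = len(rhs)
    x = [0.0] * n
    for sweep in range(1, max_sweeps + 1):
        largest_change = 0.0
        for i in range(n):
            others = sum(coeffs[i][j] * x[j] for j in range(n) if j != i)
            new_value = (rhs[i] - others) / coeffs[i][i]
            largest_change = max(largest_change, abs(new_value - x[i]))
            x[i] = new_value  # the updated value is used at once in later rows
        if largest_change < tol:
            return x, sweep
    raise RuntimeError("Gauss-Seidel iteration did not converge")


# The system of Eq. 4.15, with the largest coefficients on the diagonal.
solution, sweeps = gauss_seidel([[3, 2, 1], [2, 3, 1], [1, 1, 4]], [11, 13, 12])
```

Convergence is not guaranteed for an arbitrary set of equations, which is the
difficulty noted above; the sketch therefore raises an error rather than
returning an unconverged result.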

4 Ralston, A., and H. Wilf, Eds., Mathematical Methods for Digital Computers.
New York: Wiley (1960). Pennington, R. H., Introductory Computer
Methods and Numerical Analysis. New York: Macmillan (1970).

EXERCISES


4.1.1 Evaluate the following determinants


(a) 


1 0 1 
0 1 0 
1 0 0 


(b) 0 V3 0 0 

y/3 0 2 0 

y/2 0 2 0 ^3 

0 0 V3 0 


(c) 


1 2 0 
3 1 2 

0 3 1 


4.1.2 Test the set of linear homogeneous equations

$$\begin{aligned} x + 3y + 3z &= 0, \\ x - y + z &= 0, \\ 2x + y + 3z &= 0, \end{aligned}$$

to see if it possesses a nontrivial solution.

4.1.3 Given the pair of equations

$$\begin{aligned} x + 2y &= 3, \\ 2x + 4y &= 6. \end{aligned}$$

(a) Show that the determinant of the coefficients vanishes. 

(b) Show that the numerator determinants (Eq. 4.14) also vanish. 

(c) Find at least two solutions. 

4.1.4 Express the components of A × B as 2 × 2 determinants. Show then that the dot
product A · (A × B) yields a Laplacian expansion of a 3 × 3 determinant. Finally,
note that two rows of the 3 × 3 determinant are identical and hence A · (A × B) = 0.

4.1.5 If $C_{ij}$ is the cofactor of element $a_{ij}$ (formed by striking out the ith row and jth
column and including a sign $(-1)^{i+j}$), show that

(a) $\sum_i a_{ij} C_{ij} = \sum_i a_{ji} C_{ji} = |A|$, where |A| is the determinant with the elements $a_{ij}$,

(b) $\sum_i a_{ij} C_{ik} = \sum_i a_{ji} C_{ki} = 0$, $j \neq k$.

4.1.6 A determinant with all elements of order unity may be surprisingly small. The
Hilbert determinant $H_{ij} = (i + j - 1)^{-1}$, i, j = 1, 2, ..., n is notorious for its
small values.

(a) Calculate the value of the Hilbert determinants of order n for n = 1, 2, and 3.

(b) If an appropriate subroutine is available, find the Hilbert determinants of
order n for n = 4, 5, and 6.

ANS.    n    Det(H_n)

        1    1.
        2    8.33333 × 10^-2
        3    4.62963 × 10^-4
        4    1.65344 × 10^-7
        5    3.74930 × 10^-12
        6    5.36730 × 10^-18

4.1.7 Solve the following set of linear simultaneous equations. Give the results to five
decimal places.

$$\begin{aligned} 1.0x_1 + 0.9x_2 + 0.8x_3 + 0.4x_4 + 0.1x_5 \phantom{{}+ 0.0x_6} &= 1.0 \\ 0.9x_1 + 1.0x_2 + 0.8x_3 + 0.5x_4 + 0.2x_5 + 0.1x_6 &= 0.9 \\ 0.8x_1 + 0.8x_2 + 1.0x_3 + 0.7x_4 + 0.4x_5 + 0.2x_6 &= 0.8 \\ 0.4x_1 + 0.5x_2 + 0.7x_3 + 1.0x_4 + 0.6x_5 + 0.3x_6 &= 0.7 \\ 0.1x_1 + 0.2x_2 + 0.4x_3 + 0.6x_4 + 1.0x_5 + 0.5x_6 &= 0.6 \\ 0.1x_2 + 0.2x_3 + 0.3x_4 + 0.5x_5 + 1.0x_6 &= 0.5 \end{aligned}$$

Note. These equations may also be solved by matrix inversion, Section 4.2.


4.2 MATRICES 

Matrix analysis is essentially a theory of linear operations (linear algebra). 
Suppose, for instance, that a linear operator A is operating in a space that is 
described by the usual basis vectors i, j, and k. A operating on i transforms it 
into some linear combination of the basis vectors : 

$$\mathsf{A}\mathbf{i} = \mathbf{i} a_{11} + \mathbf{j} a_{21} + \mathbf{k} a_{31}.$$

(In Section 4.3 the coefficients will be developed in detail for A, a rotation
operator.) Similarly, the effect of A on j is given by a linear combination
$\mathbf{i} a_{12} + \mathbf{j} a_{22} + \mathbf{k} a_{32}$, and on k by $\mathbf{i} a_{13} + \mathbf{j} a_{23} + \mathbf{k} a_{33}$. Then the effect of A on a
vector u is to produce a vector v,

$$\mathbf{v} = \mathsf{A}\mathbf{u} = \mathsf{A}(\mathbf{i} u_1 + \mathbf{j} u_2 + \mathbf{k} u_3).$$

Expanding, we obtain

$$\begin{aligned} \mathbf{i} v_1 + \mathbf{j} v_2 + \mathbf{k} v_3 = {} & \mathbf{i}(a_{11} u_1 + a_{12} u_2 + a_{13} u_3) \\ & + \mathbf{j}(a_{21} u_1 + a_{22} u_2 + a_{23} u_3) \\ & + \mathbf{k}(a_{31} u_1 + a_{32} u_2 + a_{33} u_3). \end{aligned}$$

Equating the i components, we have

$$v_1 = \sum_{j=1}^{3} a_{1j} u_j$$

or, in general,

$$v_i = \sum_{j=1}^{3} a_{ij} u_j, \qquad i = 1, 2, 3. \tag{4.22}$$

We label the array of elements $a_{ij}$ a matrix and take the summation of products
in Eq. 4.22 as a definition of matrix multiplication (inner product). Before
passing to formal definitions, the reader should note that operator A is
described or characterized by its effect on the basis vectors. The matrix elements
$a_{ij}$ constitute a representation of the operator, a representation that depends on
the choice of the basis.

Basic Definitions 

A matrix may be defined as a square or rectangular array of numbers or 
functions that obeys certain laws. This is a perfectly logical extension of familiar 
mathematical concepts. In arithmetic we deal with single numbers. In the theory 
of complex variables (Chapter 6) we deal with ordered pairs of numbers, 
(1,2) = 1 + 2 i, in which the ordering is important. We now consider numbers 
(or functions) ordered in a square or rectangular array. For convenience in 
later work the numbers are distinguished by two subscripts, the first indicating 
the row (horizontal) and the second indicating the column (vertical) in which 
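the number appears. For instance, $a_{13}$ is the matrix element in the first row,
third column. Hence, if A is a matrix with m rows and n columns,

$$\mathsf{A} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}.$$
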
Perhaps the most important fact to note is that the elements $a_{ij}$ are not combined
with one another. The matrix is not a determinant. It is an ordered array of
numbers, not a single number. It makes no more sense to add or multiply all
the $a_{ij}$'s together than it does to write 1 + 2i = 3!

The matrix A, so far just an array of numbers, has the properties we assign to
it. Literally, this means constructing a new form of mathematics. We postulate
that matrices A, B, and C, with elements $a_{ij}$, $b_{ij}$, and $c_{ij}$, respectively, combine
according to the following rules:


Equality 

Matrix A = Matrix B if and only if $a_{ij} = b_{ij}$ for all values of i and j. This,
of course, requires that A and B each be m by n arrays (m rows, n columns).


Addition 

A + B = C if and only if $a_{ij} + b_{ij} = c_{ij}$ for all values of i and j, the elements
combining according to the laws of ordinary algebra (or arithmetic if they are
simple numbers). This means that A + B = B + A, commutation. Also, an
associative law is satisfied, (A + B) + C = A + (B + C).


Multiplication (by a Scalar)

The multiplication of matrix A by the scalar quantity α is defined as

$$\alpha \mathsf{A} = (\alpha \mathsf{A}),$$

in which the elements of αA are $\alpha a_{ij}$; that is, each element of matrix A is
multiplied by the scalar factor. This is in striking contrast to the behavior of
determinants in which the factor α multiplies only one column or one row and not
every element of the entire determinant. A consequence of this scalar
multiplication is that

$$\alpha \mathsf{A} = \mathsf{A} \alpha, \quad \text{commutation.}$$


Multiplication (Matrix Multiplication), Inner 
Product 

AB = C if and only if¹

$$c_{ij} = \sum_k a_{ik} b_{kj}. \tag{4.23}$$

The ij element of C is formed as a scalar product of the ith row of A with the
jth column of B (which demands that A have the same number of columns
(n) as B has rows). The dummy index k takes on all the values 1, 2, ..., n in
succession, that is,

$$c_{ij} = a_{i1} b_{1j} + a_{i2} b_{2j} + a_{i3} b_{3j} \tag{4.24}$$

for n = 3. Obviously, the dummy index k may be replaced by any other symbol
that is not already in use without altering Eq. 4.23. Perhaps the situation may
be clarified by stating that Eq. 4.23 defines the method of combining certain
matrices. This method of combination, to give it a label, is called matrix
multiplication. To illustrate, consider two matrices

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \quad \text{and} \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{4.25}$$

The 11 element of the product, $(\sigma_1 \sigma_3)_{11}$, is given by the sum of the products
of elements of the first row of $\sigma_1$ with the corresponding elements of the first
column of $\sigma_3$:

$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \to 0 \cdot 1 + 1 \cdot 0 = 0.$$

Continuing, we have

$$\sigma_1 \sigma_3 = \begin{pmatrix} 0 \cdot 1 + 1 \cdot 0 & 0 \cdot 0 + 1 \cdot (-1) \\ 1 \cdot 1 + 0 \cdot 0 & 1 \cdot 0 + 0 \cdot (-1) \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \tag{4.26}$$

Here

$$(\sigma_1 \sigma_3)_{ij} = (\sigma_1)_{i1} (\sigma_3)_{1j} + (\sigma_1)_{i2} (\sigma_3)_{2j}.$$


1 Some authors follow the summation convention here (compare Section 3.1).



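Direct application of the definition of matrix multiplication shows that

$$\sigma_3 \sigma_1 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \tag{4.27}$$

and by Eq. 4.23

$$\sigma_1 \sigma_3 = -\sigma_3 \sigma_1. \tag{4.28}$$

Except in special cases, matrix multiplication is not commutative.²,³

$$\mathsf{A}\mathsf{B} \neq \mathsf{B}\mathsf{A}. \tag{4.29}$$
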

However, from the definition of matrix multiplication we can show that an 
associative law holds, (AB)C = A(BC). There is also a distributive law, 
A(B + C) = AB + AC. 
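
The row-by-column rule of Eq. 4.23 translates directly into a short Python
sketch. The function name `matmul` and the list-of-rows storage are assumptions
made for illustration; indices are zero-based.

```python
def matmul(a, b):
    """Matrix product by Eq. 4.23: c_ij is the sum over k of a_ik * b_kj."""
    if len(a[0]) != len(b):
        raise ValueError("A must have as many columns as B has rows")
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


sigma1 = [[0, 1], [1, 0]]
sigma3 = [[1, 0], [0, -1]]

product_13 = matmul(sigma1, sigma3)   # compare Eq. 4.26
product_31 = matmul(sigma3, sigma1)   # the opposite order
```

The two products differ by an overall sign, a concrete instance of Eq. 4.29.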


Direct Product 

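A second procedure for multiplying matrices, known as the direct tensor or
Kronecker product, follows. If A is an m × m matrix and B an n × n matrix,
then the direct product is

$$\mathsf{A} \otimes \mathsf{B} = \mathsf{C}. \tag{4.30}$$

C is an mn × mn matrix with elements

$$C_{\alpha\beta} = A_{ij} B_{kl}, \tag{4.31}$$

with

$$\alpha = n(i - 1) + k, \qquad \beta = n(j - 1) + l.$$

For instance, if A and B are both 2 × 2 matrices,

$$\mathsf{A} \otimes \mathsf{B} = \begin{pmatrix} a_{11}\mathsf{B} & a_{12}\mathsf{B} \\ a_{21}\mathsf{B} & a_{22}\mathsf{B} \end{pmatrix} = \begin{pmatrix} a_{11}b_{11} & a_{11}b_{12} & a_{12}b_{11} & a_{12}b_{12} \\ a_{11}b_{21} & a_{11}b_{22} & a_{12}b_{21} & a_{12}b_{22} \\ a_{21}b_{11} & a_{21}b_{12} & a_{22}b_{11} & a_{22}b_{12} \\ a_{21}b_{21} & a_{21}b_{22} & a_{22}b_{21} & a_{22}b_{22} \end{pmatrix}. \tag{4.32}$$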


2 The reader should note that the basic definitions of equality, addition, and
multiplication are given in terms of the matrix elements, the $a_{ij}$'s, and so on.
All our matrix operations can be carried out in terms of the matrix elements.
However, Cayley (1859) showed that we can also treat a matrix as a single
algebraic operator, as in Eq. 4.29. Matrix elements and single operators each
have their advantages as will be seen in the following section. We shall use
both approaches.

3 Commutation or the lack of it is conveniently described by the commutator
bracket symbol, [A, B] = AB − BA. Equation 4.29 becomes [A, B] ≠ 0.

The direct product is associative but not commutative. As an example of
the direct product, the Dirac matrices of Section 4.5 may be developed as direct
products of the Pauli matrices and the unit matrix. Other examples appear in
the construction of groups in group theory and in vector or Hilbert space in
quantum theory.

The direct product defined here is sometimes called the “standard” form
and denoted by ⊗. Three other types of direct products of matrices exist as
mathematical possibilities or curiosities but have little or no application in
mathematical physics.
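
The index bookkeeping of Eq. 4.31 can be sketched in Python as follows. Indices
are zero-based here, so α = n·i + k and β = n·j + l replace the forms given with
Eq. 4.31; the function name `direct_product` and the sample matrices are
assumptions of this sketch.

```python
def direct_product(a, b):
    """Direct (Kronecker) product of an m x m matrix a and an n x n matrix b."""
    m, n = len(a), len(b)
    c = [[0] * (m * n) for _ in range(m * n)]
    for i in range(m):
        for j in range(m):
            for k in range(n):
                for l in range(n):
                    # Zero-based form of alpha = n(i - 1) + k, beta = n(j - 1) + l.
                    c[n * i + k][n * j + l] = a[i][j] * b[k][l]
    return c


a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]
ab = direct_product(a, b)
ba = direct_product(b, a)
```

Comparing `ab` with `ba` shows that the direct product is not commutative:
the same sixteen products appear in both, but in different positions.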


Special Cases 

A number of matrices are of special interest. If the matrix has one column and
n rows, it is called a column vector, |x⟩ with components $x_i$, i = 1, 2, ..., n.

Similarly, if the matrix has one row and n columns, it is called a row vector,
⟨x| with components $x_i$, i = 1, 2, ..., n. Clearly, if A is an n × n matrix, |x⟩
an n-component column vector, and ⟨x| an n-component row vector,

A|x⟩ and ⟨x|A

are defined by Eq. 4.23, whereas

A⟨x| and |x⟩A

are not defined.

Clearly, the row vector ⟨x| = ($x_1$, $x_2$, ..., $x_n$) and the column vector |x⟩
with the same components are not independent. Just as clearly they cannot be
added: ⟨x| + |x⟩ is not defined. In quantum theory it is convenient to consider
the column vectors |x⟩ in one space and the row vectors ⟨x| in a different space,
a dual space.

In the remainder of this chapter we confine our attention to column vectors,
row vectors, and square matrices.

The unit matrix 1 has elements $\delta_{ij}$, Kronecker delta, and the property that
1A = A1 = A for all A.

$$\mathsf{1} = \begin{pmatrix} 1 & 0 & 0 & 0 & \cdots \\ 0 & 1 & 0 & 0 & \cdots \\ 0 & 0 & 1 & 0 & \cdots \\ 0 & 0 & 0 & 1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \end{pmatrix}. \tag{4.33}$$


If all elements are zero, the matrix is called the null matrix and is denoted
by 0. For all A

$$\mathsf{0}\mathsf{A} = \mathsf{A}\mathsf{0} = \mathsf{0}.$$

$$\mathsf{0} = \begin{pmatrix} 0 & 0 & 0 & \cdots \\ 0 & 0 & 0 & \cdots \\ 0 & 0 & 0 & \cdots \\ \vdots & \vdots & \vdots & \end{pmatrix}.$$

It should be noted that it is possible for the product of two matrices to be the
null matrix without either one being the null matrix. For example, if


A = 



and 



AB = 0. Once more the results of ordinary algebra do not apply directly. 


Diagonal Matrices 

An important special type of matrix is the square matrix in which all the 
nondiagonal elements are zero. Specifically, if a 3 × 3 matrix A is diagonal,

$$\mathsf{A} = \begin{pmatrix} a_{11} & 0 & 0 \\ 0 & a_{22} & 0 \\ 0 & 0 & a_{33} \end{pmatrix}.$$

The physical interpretation of such diagonal matrices and the method of 
reducing matrices to this diagonal form are considered in Section 4.6. Here 
we simply note a significant property of diagonal matrices — multiplication of 
diagonal matrices is commutative, 

AB = BA, if A and B are each diagonal. 

Trace 

In any square matrix the sum of the diagonal elements is called the trace . 
One of its interesting and useful properties is that the trace of a product of 
two matrices A and B is independent of the order of multiplication : 

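$$\begin{aligned} \operatorname{trace}(\mathsf{A}\mathsf{B}) &= \sum_i (\mathsf{A}\mathsf{B})_{ii} = \sum_i \sum_j a_{ij} b_{ji} \\ &= \sum_j \sum_i b_{ji} a_{ij} = \sum_j (\mathsf{B}\mathsf{A})_{jj} \\ &= \operatorname{trace}(\mathsf{B}\mathsf{A}). \end{aligned} \tag{4.35}$$

This holds even though AB ≠ BA. Equation 4.35 means that the trace of any
commutator bracket, [A, B] = AB − BA, is zero.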
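
A small numerical check of Eq. 4.35 can be sketched as below. The matrices `a`
and `b` are arbitrary illustrative choices, and the helper names `matmul` and
`trace` are assumptions of this sketch.

```python
def matmul(a, b):
    """Matrix product, c_ij = sum over k of a_ik * b_kj."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(m):
    """Sum of the diagonal elements of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


a = [[1, 2], [3, 4]]
b = [[0, 1], [5, 6]]
ab = matmul(a, b)
ba = matmul(b, a)

# The commutator [A, B] = AB - BA, element by element.
commutator = [[p - q for p, q in zip(row_ab, row_ba)]
              for row_ab, row_ba in zip(ab, ba)]
```

The products `ab` and `ba` are different matrices, yet their traces agree and
the trace of `commutator` is zero.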

In Exercise 4.5.23 the operation of taking the trace selects one term out of 
a sum of 16 terms. The trace will serve the same function relative to matrices 
as orthogonality serves for vectors and functions. 

In terms of tensors (Section 3.2) the trace is a contraction and like the 
contracted second-rank tensor is a scalar (invariant). 

Matrices are used extensively to represent the elements of groups (compare 
Exercise 4.2.7 and Sections 4.8 to 4.12). The trace of the matrix representing 
the group element is known in group theory as the character . The reason for 
the special name and special attention is that while the matrices may vary the 
trace or character remains invariant (compare Exercise 4.3.9). Finally, we note 
that as an operator the trace is a linear operator. 

Matrix Inversion 

At the beginning of this section matrix A is introduced as the representation
of an operator that (linearly) transforms the coordinate axes. A rotation would
be one example of such a linear transformation. Now we look for the inverse
transformation $\mathsf{A}^{-1}$ that will restore the original coordinate axes. This means,
as either a matrix or an operator equation,⁴

$$\mathsf{A}\mathsf{A}^{-1} = \mathsf{A}^{-1}\mathsf{A} = \mathsf{1}. \tag{4.36}$$

From Exercise 4.2.32

$$a^{(-1)}_{ij} = \frac{C_{ji}}{|\mathsf{A}|} \tag{4.37}$$

with the assumption that the determinant of A (|A|) ≠ 0. If it is zero, we label
A singular. No inverse exists. This conclusion that we must require |A| ≠ 0 is
about the only use of Eq. 4.37. As explained at the end of Section 4.1, this
determinant form is totally unsuited for numerical work with large matrices.

There is a wide variety of alternative techniques. One of the best and most
commonly used is the Gauss-Jordan matrix inversion technique. The theory is
based on the results of Exercises 4.2.34 and 4.2.35, which show that there exist
matrices $\mathsf{M}_L$ such that the product $\mathsf{M}_L \mathsf{A}$ will be A but with

a. one row multiplied by a constant, or

b. one row replaced by the original row minus a multiple of another row, or

c. rows interchanged.


Other matrices $\mathsf{M}_R$ operating on the right ($\mathsf{A}\mathsf{M}_R$) can carry out the same
operations on the columns of A.

This means that the matrix rows and columns may be altered (by matrix
multiplication) as though we were dealing with determinants, so we can apply
the Gauss-Jordan elimination techniques of Section 4.1 to the matrix elements.
Hence there exists a matrix $\mathsf{M}_L$ (or $\mathsf{M}_R$) such that⁵

$$\mathsf{M}_L \mathsf{A} = \mathsf{1}. \tag{4.38}$$

Then $\mathsf{M}_L = \mathsf{A}^{-1}$. We determine $\mathsf{M}_L$ by carrying out the identical elimination
operations on the unit matrix. Then

$$\mathsf{M}_L \mathsf{1} = \mathsf{M}_L. \tag{4.39}$$

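To clarify this, we consider a specific example.

4 Here and throughout this chapter our matrices have finite rank. If A is an
infinite rank matrix (n × n with n → ∞), then life is more difficult. For $\mathsf{A}^{-1}$
to be the inverse we must demand that both

$$\mathsf{A}\mathsf{A}^{-1} = \mathsf{1} \quad \text{and} \quad \mathsf{A}^{-1}\mathsf{A} = \mathsf{1}.$$

One relation no longer implies the other.

5 Remember that det(A) ≠ 0.

EXAMPLE 4.2.1 Gauss-Jordan Matrix Inversion


We want to invert the matrix

$$\mathsf{A} = \begin{pmatrix} 3 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 4 \end{pmatrix}. \tag{4.40}$$

For convenience we write A and 1 side by side and carry out the identical
operations on each:

$$\begin{pmatrix} 3 & 2 & 1 \\ 2 & 3 & 1 \\ 1 & 1 & 4 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{4.41}$$

To be systematic, we multiply each row to get $a_{k1} = 1$,

$$\begin{pmatrix} 1 & 0.6667 & 0.3333 \\ 1 & 1.5000 & 0.5000 \\ 1 & 1.0000 & 4.0000 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0.3333 & 0 & 0 \\ 0 & 0.5000 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{4.42}$$

Subtracting the first row from the second and third, we obtain

$$\begin{pmatrix} 1 & 0.6667 & 0.3333 \\ 0 & 0.8333 & 0.1667 \\ 0 & 0.3333 & 3.6667 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0.3333 & 0 & 0 \\ -0.3333 & 0.5000 & 0 \\ -0.3333 & 0 & 1 \end{pmatrix}. \tag{4.43}$$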


Then we divide the second row (of both matrices) by 0.8333 and subtract 0.6667
times it from the first row, and 0.3333 times it from the third row. The results
for both matrices are

$$\begin{pmatrix} 1 & 0 & 0.2000 \\ 0 & 1 & 0.2000 \\ 0 & 0 & 3.6000 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0.6000 & -0.4000 & 0 \\ -0.4000 & 0.6000 & 0 \\ -0.2000 & -0.2000 & 1 \end{pmatrix}. \tag{4.44}$$

We divide the third row (of both matrices) by 3.6. Then as the last step 0.2
times the third row is subtracted from each of the first two rows (of both
matrices). Our final pair is

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0.6111 & -0.3889 & -0.0556 \\ -0.3889 & 0.6111 & -0.0556 \\ -0.0556 & -0.0556 & 0.2778 \end{pmatrix}. \tag{4.45}$$

The check is to multiply the original A by the calculated $\mathsf{A}^{-1}$ to see if we
really do get the unit matrix 1. The result to four decimal places is

$$\begin{pmatrix} 0.9999 & -0.0001 & -0.0002 \\ -0.0001 & 0.9999 & -0.0002 \\ -0.0002 & -0.0002 & 1.0000 \end{pmatrix} \tag{4.46}$$

or 1, the unit matrix to within the round-off error (mostly from rounding off
−0.05555··· to −0.0556).

As with the Gauss-Jordan solution of simultaneous linear algebraic equations,
this technique is well adapted to large computing machines. Indeed, this
Gauss-Jordan matrix inversion technique will probably be available in the
program library as a subroutine.
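
A minimal sketch of such a subroutine in Python follows. It carries out the
same row operations on the matrix and on the unit matrix, but the order of
operations differs slightly from the example: each row is normalized only when
it becomes the pivot row, and a row interchange is made if a zero appears on
the diagonal. The function name `gauss_jordan_inverse` is hypothetical, and
exact fractions are an assumption made here so that no round-off appears.

```python
from fractions import Fraction


def gauss_jordan_inverse(matrix):
    """Invert a square matrix by applying identical row operations to A and 1."""
    n = len(matrix)
    left = [[Fraction(v) for v in row] for row in matrix]
    right = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next((r for r in range(col, n) if left[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular; no inverse exists")
        left[col], left[pivot] = left[pivot], left[col]
        right[col], right[pivot] = right[pivot], right[col]
        # Scale the pivot row so that the diagonal element is 1.
        scale = left[col][col]
        left[col] = [v / scale for v in left[col]]
        right[col] = [v / scale for v in right[col]]
        # Clear this column from every other row, above and below.
        for r in range(n):
            if r != col:
                factor = left[r][col]
                left[r] = [v - factor * w for v, w in zip(left[r], left[col])]
                right[r] = [v - factor * w for v, w in zip(right[r], right[col])]
    return right


# The matrix whose rows carry the coefficients 3, 2, 1; 2, 3, 1; 1, 1, 4.
inverse = gauss_jordan_inverse([[3, 2, 1], [2, 3, 1], [1, 1, 4]])
```

Rounded to four decimal places, the elements of `inverse` reproduce the
right-hand matrix of Eq. 4.45.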


EXERCISES 


4.2.1 Show that matrix multiplication is associative, (AB)C = A(BC). 

4.2.2 Show that 

$$(A + B)(A - B) = A^2 - B^2$$
if and only if A and B commute, 

[A, B] = 0. 

4.2.3 Show that matrix A is a linear operator by showing that 

$$A(c_1 \mathbf{r}_1 + c_2 \mathbf{r}_2) = c_1 A\mathbf{r}_1 + c_2 A\mathbf{r}_2.$$

It can be shown that an $n \times n$ matrix is the most general linear operator in
an $n$-dimensional vector space. This means that every linear operator in this
$n$-dimensional vector space is equivalent to a matrix.

4.2.4 (a) Complex numbers, $a + ib$, with $a$ and $b$ real, may be represented by (or
are isomorphic with) $2 \times 2$ matrices:

$$a + ib \leftrightarrow \begin{pmatrix} a & b \\ -b & a \end{pmatrix}.$$

Show that this matrix representation is valid for (i) addition and (ii) multiplication.

(b) Find the matrix corresponding to $(a + ib)^{-1}$.


4.2.5 If A is an $n \times n$ matrix, show that

$$\det(-A) = (-1)^n \det A.$$

4.2.6 (a) Matrix C is the matrix product of A and B. Show that the determinant of
C is the product of the determinants of A and B,

$$\det C = \det A \times \det B.$$

Hint. The determinant can be written

$$\varepsilon_{ijk}\, a_{i1} a_{j2} a_{k3} = |A|.$$

(b) If C = A + B, in general,

$$\det C \neq \det A + \det B.$$

Construct a specific numerical example to illustrate this inequality.

4.2.7 Given the three matrices




and 




Find all possible products of A, B, and C, two at a time, including squares.
Express your answers in terms of A, B, and C, and 1, the unit matrix.
These three matrices together with the unit matrix form a representation of a
mathematical group, the vierergruppe.

Sections 4.8 and 4.9 (Group Theory) contain repeated references to this group.


4.2.8 Given


show that

$$K^n = KK \cdots K \ (n \text{ factors}) = \mathsf{1}$$

(with the proper choice of $n$, $n \neq 0$).


4.2.9 Verify the Jacobi identity

$$[A, [B, C]] = [B, [A, C]] - [C, [A, B]].$$

This is useful in matrix descriptions of elementary particles. As a mnemonic 
aid, the reader might note that the Jacobi identity has the same form as the BAC- 
CAB rule of Section 1.5. 


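4.2.10 Show that the matrices

$$A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad B = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad C = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$

satisfy the commutation relations

$$[A, B] = C, \quad [A, C] = 0, \quad \text{and} \quad [B, C] = 0.$$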


4.2.11 Let



Show that

(a) $i^2 = j^2 = k^2 = -\mathsf{1}$, where 1 is the unit matrix.

(b) $ij = -ji = k$,
$jk = -kj = i$,
$ki = -ik = j$.

These three matrices (i, j, and k) plus the unit matrix 1 form a basis for
quaternions. An alternate basis is provided by the four $2 \times 2$ matrices, $i\sigma_1$, $i\sigma_2$,
$-i\sigma_3$, and 1, where the $\sigma$'s are the Pauli spin matrices of Exercise 4.2.13.


4.2.12 A matrix with elements $a_{ij} = 0$ for $j < i$ may be called upper right triangular.
The elements in the lower left (below and to the left of the main diagonal) vanish.
Examples are the matrices in Chapters 12 and 13 relating power series and
eigenfunction expansions.

Show that the product of two upper right triangular matrices is an upper right
triangular matrix.

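4.2.13 The three Pauli spin matrices are

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$

Show that

(a) $\sigma_i^2 = \mathsf{1}$,

(b) $\sigma_i \sigma_j = i\sigma_k$, $(i, j, k) = (1, 2, 3), (2, 3, 1), (3, 1, 2)$ (cyclic permutation),

(c) $\sigma_i \sigma_j + \sigma_j \sigma_i = 2\delta_{ij}\mathsf{1}$.

These matrices were used by Pauli in the nonrelativistic theory of electron spin.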


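4.2.14 Using the Pauli $\sigma$'s of Exercise 4.2.13, show that

$$(\boldsymbol{\sigma} \cdot \mathbf{a})(\boldsymbol{\sigma} \cdot \mathbf{b}) = \mathbf{a} \cdot \mathbf{b}\,\mathsf{1} + i\boldsymbol{\sigma} \cdot (\mathbf{a} \times \mathbf{b}).$$

Here

$$\boldsymbol{\sigma} = \mathbf{i}\sigma_x + \mathbf{j}\sigma_y + \mathbf{k}\sigma_z$$

and a and b are ordinary vectors.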

4.2.15 One description of spin 1 particles uses the matrices


M = 






and 



M = 


Show that

(a) $[M_x, M_y] = iM_z$, and so on$^6$ (cyclic permutation of indices).
Using the Levi-Civita symbol of Section 3.4, we may write

$$[M_i, M_j] = i\varepsilon_{ijk} M_k.$$

(b) $M^2 \equiv M_x^2 + M_y^2 + M_z^2 = 2\,\mathsf{1}$,
where 1 is the unit matrix.

(c) $[M^2, M_i] = 0$,

$[M_z, L^+] = L^+$,

$[L^+, L^-] = 2M_z$,

where
$L^+ \equiv M_x + iM_y$,

$L^- \equiv M_x - iM_y$.


4.2.16 Repeat Exercise 4.2.15 using an alternate representation,


M = 


and 




6 $[A, B] = AB - BA$.

$$M_z = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

In Section 4.11 these matrices appear as the generators of the rotation matrices.


4.2.17 Show that the matrix-vector equation

$$\left( \mathbf{M} \cdot \nabla + \mathsf{1}\,\frac{1}{c}\frac{\partial}{\partial t} \right) \psi = 0$$

reproduces Maxwell's equations in vacuum. Here $\psi$ is a column vector with
components $\psi_j = B_j - iE_j/c$, $j = x, y, z$. M is a vector whose elements are the
angular momentum matrices of Exercise 4.2.16. Note that $\varepsilon_0 \mu_0 = 1/c^2$.

From Exercise 4.2.15(b)

$$M^2 \psi = 2\psi.$$

A comparison with the Dirac relativistic electron equation suggests that the
"particle" of electromagnetic radiation, the photon, has zero rest mass and
a spin of 1 (in units of $\hbar$).


4.2.18 Repeat Exercise 4.2.15, using the matrices for a spin of 3/2,


M = 




and 


M z 



0 

1 

0 

0 



4.2.19 An operator P commutes with $J_x$ and $J_y$, the x and y components of an angular
momentum operator. Show that P commutes with the third component of
angular momentum; that is,

$$[P, J_z] = 0.$$

Hint. The angular momentum components must satisfy the commutation relation 
of Exercise 4.2.15(a). 


4.2.20 The $L^+$ and $L^-$ matrices of Exercise 4.2.15 are "ladder operators." $L^+$ operating
on a system of spin projection $m$ will raise the spin projection to $m + 1$ if $m$ is
below its maximum. $L^+$ operating on $m_{\max}$ yields zero. $L^-$ reduces the spin
projection in unit steps in a similar fashion. Dividing by $\sqrt{2}$, we have

$$L^+ = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \quad L^- = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}.$$

Show that

$L^+ |-1\rangle = |0\rangle$, $L^- |-1\rangle$ = null column vector,

$L^+ |0\rangle = |1\rangle$, $L^- |0\rangle = |-1\rangle$,

$L^+ |1\rangle$ = null column vector, $L^- |1\rangle = |0\rangle$,

where

$$|-1\rangle = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}, \quad |0\rangle = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad |1\rangle = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix},$$

representing states of spin projection $-1$, 0, and 1, respectively.

Note. Differential operator analogs of these ladder operators appear in Exercise
12.6.7.

4.2.21 Vectors A and B are related by the tensor T

$$\mathbf{B} = \mathsf{T}\mathbf{A}.$$

Given A and B show that there is no unique solution for the components of T.
This is why vector division B/A is undefined (apart from the special case of
A and B parallel and T then a scalar).

4.2.22 We might ask for a vector $\mathbf{A}^{-1}$, an inverse of a given vector A in the sense that

$$\mathbf{A} \cdot \mathbf{A}^{-1} = \mathbf{A}^{-1} \cdot \mathbf{A} = 1.$$

Show that this relation does not suffice to define $\mathbf{A}^{-1}$ uniquely. A has literally
an infinite number of inverses.

4.2.23 If A is diagonal, with all diagonal elements different, and A and B commute, 
show that B is diagonal. 

4.2.24 If A and B are diagonal, show that A and B commute. 

4.2.25 Show that trace (ABC) = trace (C B A) if any two of the three matrices commute. 

4.2.26 Angular momentum matrices satisfy a commutation relation

$$[M_i, M_j] = iM_k, \quad i, j, k \text{ cyclic}.$$

Show that the trace of each angular momentum matrix vanishes. 

4.2.27 (a) The operator Tr replaces a matrix A by its trace; that is, 

$$\operatorname{Tr}(A) = \operatorname{trace}(A) = \sum_i a_{ii}.$$

Show that Tr is a linear operator. 

(b) The operator det replaces a matrix A by its determinant; that is, 

det(A) = determinant of A. 

Show that det is not a linear operator. 

4.2.28 A and B anticommute. Also, $A^2 = \mathsf{1}$, $B^2 = \mathsf{1}$. Show that trace(A) = trace(B) = 0.
Note. The Pauli and Dirac (Section 4.5) matrices are specific examples.

4.2.29 With $|x\rangle$ an $N$-dimensional column vector and $\langle y|$ an $N$-dimensional row vector,
show that

$$\operatorname{trace}(|x\rangle\langle y|) = \langle y|x\rangle.$$

Note. $|x\rangle\langle y|$ means column vector $|x\rangle$ multiplying row vector $\langle y|$. The result
is a square matrix $N \times N$.

4.2.30 (a) If two nonsingular matrices anticommute, show that the trace of each
one is zero. (Nonsingular means that the determinant of the matrix elements
$\neq 0$.)

(b) For the conditions of part (a) to hold A and B must be $n \times n$ matrices with
$n$ even. Show that if $n$ is odd a contradiction results.

4.2.31 If a matrix has an inverse, show that the inverse is unique.

4.2.32 If $A^{-1}$ has elements

$$a^{(-1)}_{ij} = \frac{C_{ji}}{|A|},$$

where $C_{ji}$ is the $ji$th cofactor of $|A|$, show that

$$A^{-1} A = \mathsf{1}.$$

Hence $A^{-1}$ is the inverse of A (if $|A| \neq 0$).

Note. In numerical work it sometimes happens that $|A|$ is almost equal to zero.
Then there is trouble.

4.2.33 Show that $\det A^{-1} = (\det A)^{-1}$.

Hint. Apply Exercise 4.2.6.

Note. If det A is zero, then A has no inverse. A is singular.

4.2.34 Find the matrices $M_L$ such that the product $M_L A$ will be A but with:

(a) the $i$th row multiplied by a constant $k$ ($a_{ij} \to k a_{ij}$, $j = 1, 2, 3, \ldots$).

(b) the $i$th row replaced by the original $i$th row minus a multiple of the $m$th
row ($a_{ij} \to a_{ij} - k a_{mj}$, $j = 1, 2, 3, \ldots$).

(c) the $i$th and $m$th rows interchanged ($a_{ij} \to a_{mj}$, $a_{mj} \to a_{ij}$, $j = 1, 2, 3, \ldots$).

4.2.35 Find the matrices $M_R$ such that the product $A M_R$ will be A but with:

(a) the $i$th column multiplied by a constant $k$ ($a_{ji} \to k a_{ji}$, $j = 1, 2, 3, \ldots$).

(b) the $i$th column replaced by the original $i$th column minus a multiple of the
$m$th column ($a_{ji} \to a_{ji} - k a_{jm}$, $j = 1, 2, 3, \ldots$).

(c) the $i$th and $m$th columns interchanged ($a_{ji} \to a_{jm}$, $a_{jm} \to a_{ji}$, $j = 1, 2, 3, \ldots$).

4.2.36 Find the inverse of

$$\begin{pmatrix} 3 & 2 & 1 \\ 2 & 2 & 1 \\ 1 & 1 & 4 \end{pmatrix}.$$


4.2.37 (a) Rewrite Eq. 2.4 of Chapter 2 (and the corresponding equations for dy and
dz) as a single matrix equation

$$|dx_k\rangle = J|dq_j\rangle.$$

J is a matrix of derivatives, the Jacobian matrix. Show that

$$\langle dx_k | dx_k \rangle = \langle dq_i | G | dq_j \rangle$$

with the metric (matrix) G having elements $g_{ij}$ given by Eq. 2.6.

(b) Show that

$$\det(J)\, dq_1\, dq_2\, dq_3 = dx\, dy\, dz.$$

Det(J) is the usual Jacobian.


4.2.38 Matrices are far too useful to remain the exclusive property of physicists. They
may appear wherever there are linear relations. For instance, in a study of
population movement the initial fraction of a fixed population in each of $n$
areas (or industries or religions, etc.) is represented by an $n$-component column
vector P. The movement of people from one area to another in a given time is
described by an $n \times n$ (stochastic) matrix T. Here $T_{ij}$ is the fraction of the
population in the $j$th area that moves to the $i$th area. (Those not moving are covered
by $i = j$.) With P describing the initial population distribution, the final
population distribution is given by the matrix equation TP = Q.

From its definition $\sum_{i=1}^{n} P_i = 1$.

(a) Show that conservation of people requires that

$$\sum_{i=1}^{n} T_{ij} = 1, \quad j = 1, 2, \ldots, n.$$

(b) Prove that

$$\sum_{i=1}^{n} Q_i = 1$$

continues the conservation of people.

4.2.39 Given a 6 x 6 matrix A with elements a tj = i = 0, 1,2, . . . , 5;j — 0, 1, 

2, . . . , 5. Find A F List the matrix elements a p ^ to five decimal places. 



4.2.40 Exercise 4.1.7 may be written in matrix form

$$AX = C.$$

Find $A^{-1}$ and calculate X as $A^{-1}C$.

4.2.41 (a) Write a subroutine that will multiply complex matrices. Assume that the
complex matrices are in a general rectangular form.

(b) Test your subroutine by multiplying pairs of the Dirac $4 \times 4$ matrices of
Table 4.1, Section 4.5.

4.2.42 (a) Write a subroutine that will call the complex matrix multiplication
subroutine of Exercise 4.2.41 and will calculate the commutator bracket of
two complex matrices.

(b) Test your complex commutator bracket subroutine with the matrices of
Exercise 4.2.16.

4.2.43 Interpolating polynomial is the name given to the $(n - 1)$-degree polynomial
determined by (and passing through) $n$ points, $(x_i, y_i)$ with all the $x_i$'s distinct.
This interpolating polynomial forms the basis for the numerical quadrature
developed in Appendix 2.

(a) Show that the requirement that an $(n - 1)$-degree polynomial in $x$ passes
through each of the $n$ points $(x_i, y_i)$ with all $x_i$ distinct leads to $n$ simultaneous
equations of the form

$$\sum_{j=0}^{n-1} a_j x_i^j = y_i, \quad i = 1, 2, \ldots, n.$$

(b) Write a computer program that will read in $n$ data points and return the
$n$ coefficients $a_j$. Use a subroutine to solve the simultaneous equations if
such a subroutine is available.

(c) Rewrite the set of simultaneous equations as a matrix equation

$$XA = Y.$$

(d) Repeat the computer calculation of part (b) but this time solve for vector
A by inverting matrix X (again, using a subroutine).

4.2.44 A calculation of the values of electrostatic potential inside a cylinder leads to

V(0.0) = 52.640    V(0.6) = 25.844
V(0.2) = 48.292    V(0.8) = 12.648
V(0.4) = 38.270    V(1.0) = 0.0

The problem is to determine the values of the argument for which V = 10, 20,
30, 40, and 50. Express $V(x)$ as a series $\sum_{n=0}^{5} a_{2n} x^{2n}$. (Symmetry requirements in
the original problem require that $V(x)$ be an even function of $x$.) Determine the
coefficients $a_{2n}$. With $V(x)$ now a known function of $x$, find the root of $V(x) - 10 = 0$,
$0 < x < 1$. Repeat for $V(x) - 20$, and so on.

ANS. $a_0 = 52.640$
$a_2 = -117.676$
$V(0.6851) = 20$.


4.3 ORTHOGONAL MATRICES 

Ordinary three-dimensional space may be described with the familiar
cartesian coordinates $(x, y, z)$. We consider a second set of cartesian coordinates
$(x', y', z')$ whose origin coincides with that of the first set but whose orientation
is different (Fig. 4.1). We can say that the primed coordinate axes have been
rotated relative to the initial, unprimed coordinate axes. Since this rotation
is a linear operation, we expect a matrix equation relating the primed basis to
the unprimed basis.

This section repeats portions of Chapters 1 and 3 in a slightly different 
context and with a different emphasis. Previously, attention was focused on 
the vector or tensor. In the case of the tensor, transformation properties were 
strongly stressed and were critical. Here emphasis is placed on the description 
of the coordinate rotation itself — the matrix. Transformation properties, the 
behavior of the matrix when the basis is changed, appear at the end of this 
section. Sections 4.5 and 4.6 continue with transformation properties in complex 
vector spaces. 

Direction Cosines 

A unit vector along the $x'$-axis ($\mathbf{i}'$) may be resolved into components along
the x-, y-, and z-axes by the usual projection technique.

$$\mathbf{i}' = \mathbf{i}\cos(x', x) + \mathbf{j}\cos(x', y) + \mathbf{k}\cos(x', z). \tag{4.47}$$

Equation 4.47 is a specific example of the linear relations discussed at the
beginning of Section 4.2.

For convenience these cosines, which are the direction cosines, are labeled

$$\begin{aligned} \cos(x', x) &= \mathbf{i}' \cdot \mathbf{i} = a_{11}, \\ \cos(x', y) &= \mathbf{i}' \cdot \mathbf{j} = a_{12}, \\ \cos(x', z) &= \mathbf{i}' \cdot \mathbf{k} = a_{13}. \end{aligned} \tag{4.48}$$

Continuing, we have

$$\begin{aligned} \cos(y', x) &= \mathbf{j}' \cdot \mathbf{i} = a_{21}, \quad (a_{21} \neq a_{12}), \\ \cos(y', y) &= \mathbf{j}' \cdot \mathbf{j} = a_{22}, \quad \text{and so on.} \end{aligned} \tag{4.49}$$

Now Eq. 4.47 may be rewritten

$$\mathbf{i}' = \mathbf{i}a_{11} + \mathbf{j}a_{12} + \mathbf{k}a_{13}$$

and also

$$\begin{aligned} \mathbf{j}' &= \mathbf{i}a_{21} + \mathbf{j}a_{22} + \mathbf{k}a_{23}, \\ \mathbf{k}' &= \mathbf{i}a_{31} + \mathbf{j}a_{32} + \mathbf{k}a_{33}. \end{aligned} \tag{4.50}$$

We may also go the other way by resolving i, j, and k into components in the
primed system. Then

$$\begin{aligned} \mathbf{i} &= \mathbf{i}'a_{11} + \mathbf{j}'a_{21} + \mathbf{k}'a_{31}, \\ \mathbf{j} &= \mathbf{i}'a_{12} + \mathbf{j}'a_{22} + \mathbf{k}'a_{32}, \\ \mathbf{k} &= \mathbf{i}'a_{13} + \mathbf{j}'a_{23} + \mathbf{k}'a_{33}. \end{aligned} \tag{4.51}$$

Associating i and i' with the subscript 1, j and j' with the subscript 2, k and k'
with the subscript 3, we see that in each case the first subscript of $a_{ij}$ refers to
the primed unit vector (i', j', k'), whereas the second subscript refers to the
unprimed unit vector (i, j, k).


Applications to Vectors 

If we consider a vector whose components are functions of the position in
space, then

$$\begin{aligned} \mathbf{V}(x, y, z) &= \mathbf{i}V_x + \mathbf{j}V_y + \mathbf{k}V_z \\ &= \mathbf{V}'(x', y', z') = \mathbf{i}'V'_x + \mathbf{j}'V'_y + \mathbf{k}'V'_z, \end{aligned} \tag{4.52}$$

since the point may be given both by the coordinates $(x, y, z)$ and the coordinates
$(x', y', z')$. Note that V and V' are geometrically the same vector (but with
different components). The coordinate axes are being rotated; the vector stays fixed.
Using Eq. 4.51 to eliminate i, j, and k, we may separate Eq. 4.52 into three
scalar equations,

$$\begin{aligned} V'_x &= a_{11}V_x + a_{12}V_y + a_{13}V_z, \\ V'_y &= a_{21}V_x + a_{22}V_y + a_{23}V_z, \\ V'_z &= a_{31}V_x + a_{32}V_y + a_{33}V_z. \end{aligned} \tag{4.53}$$

In particular, these relations will hold for the coordinates of a point $(x, y, z)$
and $(x', y', z')$, giving

$$\begin{aligned} x' &= a_{11}x + a_{12}y + a_{13}z, \\ y' &= a_{21}x + a_{22}y + a_{23}z, \\ z' &= a_{31}x + a_{32}y + a_{33}z. \end{aligned} \tag{4.54}$$

It is convenient to change the notation slightly at this point.
Let

$$x \to x_1, \quad y \to x_2, \quad z \to x_3, \tag{4.55}$$

and similarly for the primed coordinates. In this notation the set of three
equations (4.54) may be written as

$$x'_i = \sum_{j=1}^{3} a_{ij} x_j, \tag{4.56}$$

where $i$ takes on the values 1, 2, and 3 and the result is three separate equations.

Now let us set aside these results and try a different approach to the same
problem. We consider two coordinate systems $(x_1, x_2, x_3)$ and $(x'_1, x'_2, x'_3)$ with
a common origin and one point $(x_1, x_2, x_3)$ in the unprimed system, $(x'_1, x'_2, x'_3)$
in the primed system. Note the usual ambiguity. The same symbol $x$ denotes
both the coordinate axis and a particular distance along that axis. Since our
system is linear, $x'_i$ must be a linear combination of the $x_j$'s. Let

$$x'_i = \sum_{j=1}^{3} a_{ij} x_j. \tag{4.57}$$

The $a_{ij}$ may be identified as our old friends, the direction cosines. This
identification is carried out for the two-dimensional case later.

If we have two sets of quantities $(V_1, V_2, V_3)$ in the unprimed system and
$(V'_1, V'_2, V'_3)$ in the primed system, related in the same way as the coordinates
of a point in the two different systems (Eq. 4.57),

$$V'_i = \sum_{j=1}^{3} a_{ij} V_j, \tag{4.58}$$

then, as in Section 1.2, the quantities $(V_1, V_2, V_3)$ are defined as the components
of a vector; that is, a vector is defined in terms of transformation properties
of its components under a rotation of the coordinate axes. In a sense the
coordinates of a point have been taken as a prototype vector. The power and
usefulness of this definition becomes apparent in Chapter 3, in which it is
extended to define pseudovectors and tensors.

From Eq. 4.56 we can derive interesting information about the $a_{ij}$'s which
describe the orientation of coordinate system $(x'_1, x'_2, x'_3)$ relative to the system
$(x_1, x_2, x_3)$. The length from the origin to the point is the same in both systems.
Squaring, for convenience,

$$\begin{aligned} \sum_i x_i^2 = \sum_i x_i'^2 &= \sum_i \Big( \sum_j a_{ij} x_j \Big) \Big( \sum_k a_{ik} x_k \Big) \\ &= \sum_{j,k} x_j x_k \sum_i a_{ij} a_{ik}. \end{aligned} \tag{4.59}$$

This can be true for all points if and only if

$$\sum_i a_{ij} a_{ik} = \delta_{jk}, \quad j, k = 1, 2, 3. \tag{4.60}$$

Verification of Eq. 4.60, if needed, may be obtained by returning to Eq. 4.59
and setting $\mathbf{r} = (x_1, x_2, x_3) = (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)$, and so on to
evaluate the nine relations given by Eq. 4.60. This process is valid, since Eq. 4.59
must hold for all r for a given set of $a_{ij}$. Equation 4.60, a consequence of requiring
that the length remain constant (invariant) under rotation of the coordinate
system, is called the orthogonality condition. The $a_{ij}$'s, written as a matrix A,
form an orthogonal matrix. Note carefully that Eq. 4.60 is not matrix
multiplication. Rather, it is interpreted later as a scalar product of two columns
of A.

Note that two independent indices $j$ and $k$ are used.

In matrix notation Eq. 4.56 becomes

$$|x'\rangle = A|x\rangle. \tag{4.61}$$


Orthogonality Conditions — Two-Dimensional Case 

A better understanding of the $a_{ij}$'s and the orthogonality condition may be
gained by considering rotation in two dimensions in detail. (This can be thought
of as a three-dimensional system with the $x_1$-, $x_2$-axes rotated about $x_3$.) From
Fig. 4.2

$$\begin{aligned} x'_1 &= x_1 \cos\varphi + x_2 \sin\varphi, \\ x'_2 &= -x_1 \sin\varphi + x_2 \cos\varphi. \end{aligned} \tag{4.62}$$

FIG. 4.2

Therefore by Eq. 4.61

$$A = \begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix}. \tag{4.63}$$

Notice that A reduces to the unit matrix for $\varphi = 0$. Zero rotation means nothing
has changed. It is clear from Fig. 4.2 that

$$\begin{aligned} a_{11} &= \cos\varphi = \cos(x'_1, x_1), \\ a_{12} &= \sin\varphi = \cos\left(\frac{\pi}{2} - \varphi\right) = \cos(x'_1, x_2), \quad \text{and so on,} \end{aligned} \tag{4.64}$$

thus identifying the matrix elements $a_{ij}$ as the direction cosines. Equation 4.60,
the orthogonality condition, becomes

$$\begin{aligned} \sin^2\varphi + \cos^2\varphi &= 1, \\ \sin\varphi\cos\varphi - \sin\varphi\cos\varphi &= 0. \end{aligned} \tag{4.65}$$

The extension to three dimensions (rotation of the coordinates through an
angle $\varphi$ counterclockwise about $x_3$) is simply

$$A = \begin{pmatrix} \cos\varphi & \sin\varphi & 0 \\ -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{4.66}$$

The $a_{33} = 1$ expresses the fact that $x'_3 = x_3$, since the rotation has been about
the $x_3$-axis. The zeros guarantee that $x'_1$ and $x'_2$ do not depend on $x_3$ and that
$x'_3$ does not depend on $x_1$ and $x_2$. In more sophisticated language, $x_1$ and $x_2$
span an invariant subspace, whereas $x_3$ forms an invariant subspace alone. The
general form of A is reducible. Equation 4.66 gives one possible decomposition.
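As a numerical illustration, the rotation matrix of Eq. 4.66 can be checked against the orthogonality condition, Eq. 4.60, and against the invariance of length that led to it. The sketch below uses plain Python lists; the function names and the 30 degree test angle are arbitrary choices made for the example.

```python
import math


def rotation_about_x3(phi):
    """Matrix of Eq. 4.66: axes rotated counterclockwise by phi about x3."""
    c, s = math.cos(phi), math.sin(phi)
    return [[c, s, 0.0],
            [-s, c, 0.0],
            [0.0, 0.0, 1.0]]


def column_products(a):
    """Left side of Eq. 4.60: sum over i of a[i][j] * a[i][k] for each j, k."""
    n = len(a)
    return [[sum(a[i][j] * a[i][k] for i in range(n)) for k in range(n)]
            for j in range(n)]


A = rotation_about_x3(math.radians(30.0))
gram = column_products(A)  # should reproduce the Kronecker delta

# Components of one fixed point in the unprimed and primed systems (Eq. 4.56).
x = [1.0, 2.0, 3.0]
x_prime = [sum(A[i][j] * x[j] for j in range(3)) for i in range(3)]
length_sq = sum(v * v for v in x)
length_sq_prime = sum(v * v for v in x_prime)

print(gram)
print(length_sq, length_sq_prime)
```

Up to round-off, `gram` is the unit matrix and the two squared lengths agree, while the third component of the point is left untouched by the rotation about $x_3$.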


Inverse Matrix, $A^{-1}$


Returning to the general transformation matrix A, the inverse matrix $A^{-1}$ is
defined such that

$$|x\rangle = A^{-1}|x'\rangle. \tag{4.67}$$

That is, $A^{-1}$ describes the reverse of the rotation given by A and returns the
coordinate system to its original position. Symbolically, Eqs. 4.61 and 4.67
combine to give

$$|x\rangle = A^{-1}A|x\rangle, \tag{4.68}$$

and since $|x\rangle$ is arbitrary,

$$A^{-1}A = \mathsf{1}, \tag{4.69}$$

the unit matrix. Similarly,

$$AA^{-1} = \mathsf{1}, \tag{4.70}$$

using Eqs. 4.61 and 4.67 and eliminating $|x\rangle$ instead of $|x'\rangle$.


Transpose Matrix, $\tilde{A}$

We can determine the elements of our postulated inverse matrix $A^{-1}$ by
employing the orthogonality condition. Equation 4.60, the orthogonality
condition, does not conform to our definition of matrix multiplication, but it can
be put in the required form by defining a new matrix $\tilde{A}$ such that

$$\tilde{a}_{ji} = a_{ij}; \tag{4.71}$$

that is, $\tilde{A}$, called "A transpose,"$^2$ is formed from A by interchanging rows
and columns. Equation 4.60 becomes

$$\tilde{A}A = \mathsf{1}. \tag{4.72}$$

This is a restatement of the orthogonality condition and may be taken as a
definition of orthogonality. Multiplying Eq. 4.72 by $A^{-1}$ from the right and
using Eq. 4.70, we have

$$\tilde{A} = A^{-1}. \tag{4.73}$$

2 Some texts denote A transpose by $A^T$.

This important result that the inverse equals the transpose holds only for
orthogonal matrices and indeed may be taken as a further restatement of the
orthogonality condition.

Multiplying Eq. 4.73 by A from the left, we obtain

$$A\tilde{A} = \mathsf{1} \tag{4.74}$$

or

$$\sum_i a_{ji} a_{ki} = \delta_{jk}, \tag{4.75}$$

which is still another form of the orthogonality condition.

Summarizing, the orthogonality condition may be stated in several equivalent
ways:

$$\sum_i a_{ij} a_{ik} = \delta_{jk}, \tag{4.76a}$$

$$\sum_i a_{ji} a_{ki} = \delta_{jk}, \tag{4.76b}$$

$$\tilde{A}A = A\tilde{A} = \mathsf{1}, \tag{4.76c}$$

$$\tilde{A} = A^{-1}. \tag{4.76d}$$

Any one of these relations is a necessary and a sufficient condition for A to be
orthogonal.
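The matrix form of the condition, Eq. 4.76c, translates directly into a short test. The following is a minimal sketch: the helper names (`transpose`, `matmul`, `is_orthogonal`), the tolerance, and the two sample matrices are illustrative choices, not part of the text.

```python
import math


def transpose(a):
    """Interchange rows and columns (Eq. 4.71)."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def is_unit(m, tol=1e-12):
    """True if m equals the unit matrix to within tol."""
    n = len(m)
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))


def is_orthogonal(a, tol=1e-12):
    """Check Eq. 4.76c: the transpose times A and A times the transpose both give 1."""
    at = transpose(a)
    return is_unit(matmul(at, a), tol) and is_unit(matmul(a, at), tol)


phi = 0.7
rotation = [[math.cos(phi), math.sin(phi), 0.0],
            [-math.sin(phi), math.cos(phi), 0.0],
            [0.0, 0.0, 1.0]]
shear = [[1.0, 1.0, 0.0],
         [0.0, 1.0, 0.0],
         [0.0, 0.0, 1.0]]

print(is_orthogonal(rotation))  # True: the transpose is the inverse
print(is_orthogonal(shear))     # False: invertible, but not orthogonal
```

The shear matrix has an inverse, yet its transpose is not that inverse, which illustrates that Eq. 4.73 singles out the orthogonal matrices.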

It is now possible to see and understand why the term orthogonal is
appropriate for these matrices. We have the general form

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix},$$

a matrix of direction cosines in which $a_{ij}$ is the cosine of the angle between $x'_i$ and
$x_j$. Therefore $a_{11}$, $a_{12}$, $a_{13}$ are the direction cosines of $x'_1$ relative to $x_1$, $x_2$, $x_3$.
These three elements of A define a unit length along $x'_1$, that is, a unit vector $\mathbf{i}'$,

$$\mathbf{i}' = \mathbf{i}a_{11} + \mathbf{j}a_{12} + \mathbf{k}a_{13}.$$

The orthogonality relation (Eq. 4.75) is simply a statement that the unit vectors
i', j', and k' are mutually perpendicular or orthogonal. Our orthogonal
transformation matrix A rotates one orthogonal coordinate system into a second
orthogonal coordinate system.

As an example of the use of matrices, the unit vectors in spherical polar
coordinates may be written as

$$\begin{pmatrix} \mathbf{r}_0 \\ \boldsymbol{\theta}_0 \\ \boldsymbol{\varphi}_0 \end{pmatrix} = C \begin{pmatrix} \mathbf{i} \\ \mathbf{j} \\ \mathbf{k} \end{pmatrix}, \tag{4.77}$$

where C is given in Exercise 2.5.1. This is equivalent to Eq. 4.50 with i', j', and
k' replaced by $\mathbf{r}_0$, $\boldsymbol{\theta}_0$, and $\boldsymbol{\varphi}_0$. From the preceding analysis C is orthogonal.
Therefore the inverse relation becomes

$$\begin{pmatrix} \mathbf{i} \\ \mathbf{j} \\ \mathbf{k} \end{pmatrix} = C^{-1} \begin{pmatrix} \mathbf{r}_0 \\ \boldsymbol{\theta}_0 \\ \boldsymbol{\varphi}_0 \end{pmatrix} = \tilde{C} \begin{pmatrix} \mathbf{r}_0 \\ \boldsymbol{\theta}_0 \\ \boldsymbol{\varphi}_0 \end{pmatrix} \tag{4.78}$$

and Exercise 2.5.5 is solved by inspection. Similar applications of matrix
inverses appear in connection with the transformation of a power series into
a series of orthogonal functions (Gram-Schmidt orthogonalization) and the
numerical solution of integral equations.


Successive Rotations, Matrix Multiplication 
Returning to orthogonal matrices, let the coordinate rotation

$$|x'\rangle = A|x\rangle \tag{4.79}$$

be followed by a second rotation given by matrix B such that

$$|x''\rangle = B|x'\rangle. \tag{4.80}$$

In component form

$$\begin{aligned} x''_i &= \sum_j b_{ij} x'_j \\ &= \sum_j b_{ij} \sum_k a_{jk} x_k \\ &= \sum_k \Big( \sum_j b_{ij} a_{jk} \Big) x_k. \end{aligned} \tag{4.81}$$

The summation over $j$ is matrix multiplication defining a matrix C = BA
such that

$$x''_i = \sum_k c_{ik} x_k. \tag{4.82}$$

Again, the definition of matrix multiplication is found useful and indeed this is
the justification for its existence. The physical interpretation is that the matrix
product of the two matrices, BA, is the rotation that carries the unprimed
system directly into the double-primed coordinate system.
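Equations 4.79 to 4.82 can be traced numerically. In the sketch below, A is taken to be a rotation about $x_3$ and B a rotation about $x_2$; the angles, the test point, and the helper names are arbitrary choices for illustration.

```python
import math


def rot_x3(angle):
    """Rotation of the axes about x3 (form of Eq. 4.66)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x2(angle):
    """Rotation of the axes about x2."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(a, v):
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]


A = rot_x3(0.4)   # first rotation,  |x'>  = A|x>
B = rot_x2(1.1)   # second rotation, |x''> = B|x'>
x = [1.0, 2.0, 3.0]

two_steps = matvec(B, matvec(A, x))  # Eqs. 4.79 and 4.80 applied in turn
C = matmul(B, A)                     # Eq. 4.81: c_ik = sum_j b_ij a_jk
one_step = matvec(C, x)              # Eq. 4.82

print(two_steps)
print(one_step)
print(matmul(A, B)[0][1], C[0][1])   # AB and BA differ: the order matters
```

The two results for the double-primed coordinates agree, while the last line shows that reversing the order of the two rotations gives a different matrix.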


Euler Angles 

Our transformation matrix A contains nine direction cosines. Clearly, only
three of these are independent, Eq. 4.60 providing six constraints. Equivalently,
we may say that two parameters ($\theta$ and $\varphi$ in spherical polar coordinates) are
required to fix the axis of rotation. Then one additional parameter describes
the amount of rotation about the specified axis. In the Lagrangian formulation
of mechanics (Section 17.3) it is necessary to describe A by using some set of
three independent parameters rather than the redundant direction cosines. The
usual choice of parameters is the Euler angles.$^3$

FIG. 4.3 (a) Rotation about $x_3$ through angle $\alpha$; (b) Rotation about $x'_2$ through
angle $\beta$; (c) Rotation about $x''_3$ through angle $\gamma$.

The goal is to describe the orientation of a final rotated system $(x'''_1, x'''_2, x'''_3)$
relative to some initial coordinate system $(x_1, x_2, x_3)$. The final system is
developed in three steps, each step involving one rotation described by one
Euler angle (Fig. 4.3):

1. The $x'_1$-, $x'_2$-, $x'_3$-axes are rotated about the $x_3$-axis
through an angle $\alpha$ counterclockwise relative to $x_1$,
$x_2$, $x_3$. (The $x_3$- and $x'_3$-axes coincide.)

2. The $x''_1$-, $x''_2$-, $x''_3$-axes are rotated about the $x'_2$-axis$^4$
through an angle $\beta$ counterclockwise relative to $x'_1$,
$x'_2$, $x'_3$. (The $x'_2$- and the $x''_2$-axes coincide.)

3. The third and final rotation is through an angle $\gamma$
counterclockwise about the $x''_3$-axis, yielding the $x'''_1$,
$x'''_2$, $x'''_3$ system. (The $x''_3$- and $x'''_3$-axes coincide.)


The three matrices describing these rotations are

$$R_z(\alpha) = \begin{pmatrix} \cos\alpha & \sin\alpha & 0 \\ -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}, \tag{4.83}$$

exactly like Eq. 4.66,

$$R_y(\beta) = \begin{pmatrix} \cos\beta & 0 & -\sin\beta \\ 0 & 1 & 0 \\ \sin\beta & 0 & \cos\beta \end{pmatrix}, \tag{4.84}$$

and

$$R_z(\gamma) = \begin{pmatrix} \cos\gamma & \sin\gamma & 0 \\ -\sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{4.85}$$

3 There are almost as many definitions of the Euler angles as there are authors.
Here we follow the choice generally made by workers in the area of group
theory and the quantum theory of angular momentum (compare Section 4.9).

4 Many authors choose this second rotation to be about the $x'_1$-axis.

The total rotation is described by the triple matrix product,

$$A(\alpha, \beta, \gamma) = R_z(\gamma) R_y(\beta) R_z(\alpha). \tag{4.86}$$

(The component form of successive transformations is considered in Eqs. 4.79
to 4.82.)

Note the order: $R_z(\alpha)$ operates first, then $R_y(\beta)$, and finally $R_z(\gamma)$. Direct
multiplication gives

$$A(\alpha, \beta, \gamma) = \begin{pmatrix} \cos\gamma\cos\beta\cos\alpha - \sin\gamma\sin\alpha & \cos\gamma\cos\beta\sin\alpha + \sin\gamma\cos\alpha & -\cos\gamma\sin\beta \\ -\sin\gamma\cos\beta\cos\alpha - \cos\gamma\sin\alpha & -\sin\gamma\cos\beta\sin\alpha + \cos\gamma\cos\alpha & \sin\gamma\sin\beta \\ \sin\beta\cos\alpha & \sin\beta\sin\alpha & \cos\beta \end{pmatrix}. \tag{4.87}$$

Equating $A(a_{ij})$ with $A(\alpha, \beta, \gamma)$, element by element, yields the direction
cosines in terms of the three Euler angles. We could use this Euler angle
identification to verify the direction cosine identities, Eq. 1.41, of Section 1.4, but
the approach of Exercise 4.3.3 is much more elegant.
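The triple product of Eq. 4.86 and the closed form of Eq. 4.87 can be compared numerically. The sketch below follows the convention used here (rotation about $x_3$, then about the new $x_2$-axis, then about the newest $x_3$-axis); the function names and the sample angles are illustrative only.

```python
import math


def rot_z(angle):
    """Form shared by R_z(alpha), Eq. 4.83, and R_z(gamma), Eq. 4.85."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_y(angle):
    """R_y(beta), Eq. 4.84."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def euler_product(alpha, beta, gamma):
    """Eq. 4.86: R_z(alpha) acts first, then R_y(beta), then R_z(gamma)."""
    return matmul(rot_z(gamma), matmul(rot_y(beta), rot_z(alpha)))


def euler_closed_form(alpha, beta, gamma):
    """Eq. 4.87 written out element by element."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    return [[cg * cb * ca - sg * sa, cg * cb * sa + sg * ca, -cg * sb],
            [-sg * cb * ca - cg * sa, -sg * cb * sa + cg * ca, sg * sb],
            [sb * ca, sb * sa, cb]]


angles = (0.3, 1.2, -0.8)
product = euler_product(*angles)
closed = euler_closed_form(*angles)
max_diff = max(abs(product[i][j] - closed[i][j])
               for i in range(3) for j in range(3))
print(max_diff)
```

The largest difference between the two matrices is at the level of round-off, so the explicit elements of Eq. 4.87 are consistent with the stated order of multiplication.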


TWO TECHNIQUES 

Our matrix description of rotation leads to the $O_3$ group, which will be
discussed in Sections 4.10 and 4.11. Rotations may also be described by the
SU(2) group and quaternions. The power and flexibility of matrices pushed
quaternions into obscurity early in this century.$^5$ The SU(2) concepts and
techniques are often encountered in modern particle physics. The SU(2) group
is also considered in Sections 4.10 and 4.11.

The Euler angle description of rotations forms a basis for developing the
rotation group of Section 4.10.

It will be noted that the matrices have been handled in two ways in the
foregoing discussion: by their components and as single entities. Each technique
has its own advantages. Both are useful.

Consider the evaluation of $(ST)^{-1}$ where ST is a (product) matrix that has
an inverse. Then, clearly,

$$(ST)(ST)^{-1} = \mathsf{1}.$$

Multiplying first by $S^{-1}$ and then by $T^{-1}$ successively from the left, we have

$$(ST)^{-1} = T^{-1}S^{-1}. \tag{4.88}$$

5 Stephenson, R. J., "Development of Vector Analysis from Quaternions."
Am. J. Phys. 34, 194 (1966).

The inverse of a product equals the product of the inverses in reverse order.
This may be readily generalized to any number of factors.

On the other hand, the evaluation of $\widetilde{(ST)}$ may perhaps best be carried out
by considering the components. Let U = ST, with S, T, and U not necessarily
orthogonal. Then

$$u_{ik} = \sum_j s_{ij} t_{jk} = \sum_j \tilde{t}_{kj} \tilde{s}_{ji},$$

using the definition of transpose. But

$$\tilde{u}_{ki} = u_{ik},$$

and this component relation may be written as

$$\widetilde{(ST)} = \tilde{T}\tilde{S}. \tag{4.89}$$

The transpose of a product equals the product of the transposes in reverse order.
Note that in the two illustrations neither S nor T is required to be orthogonal.
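Both reverse-order rules, Eqs. 4.88 and 4.89, are easy to test on small matrices. The sketch below uses two arbitrary nonorthogonal $2 \times 2$ matrices and the closed-form $2 \times 2$ inverse; all names and numerical values are chosen only for illustration.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse_2x2(m):
    """Closed-form inverse of a 2 x 2 matrix with nonzero determinant."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


S = [[2.0, 1.0], [1.0, 1.0]]
T = [[1.0, 3.0], [0.0, 2.0]]
ST = matmul(S, T)

# Eq. 4.88: the inverse of a product, factors in reverse order.
inv_of_product = inverse_2x2(ST)
reversed_inverses = matmul(inverse_2x2(T), inverse_2x2(S))
same_order_inverses = matmul(inverse_2x2(S), inverse_2x2(T))

# Eq. 4.89: the transpose of a product, factors in reverse order.
transpose_of_product = transpose(ST)
reversed_transposes = matmul(transpose(T), transpose(S))

print(inv_of_product, reversed_inverses, same_order_inverses)
print(transpose_of_product, reversed_transposes)
```

For these matrices the reversed products reproduce the inverse and the transpose of ST, whereas keeping the factors in their original order gives a different matrix.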

Symmetry Properties 

The transpose matrix is useful in a discussion of symmetry properties. If

$$A = \tilde{A}, \quad a_{ij} = a_{ji}, \tag{4.90}$$

the matrix is called symmetric, whereas if

$$A = -\tilde{A}, \quad a_{ij} = -a_{ji}, \tag{4.91}$$

it is called antisymmetric or skewsymmetric. The diagonal elements vanish.
It is easy to show that any (square) matrix may be written as the sum of a
symmetric matrix and an antisymmetric matrix. Consider the identity

$$A = \tfrac{1}{2}[A + \tilde{A}] + \tfrac{1}{2}[A - \tilde{A}]. \tag{4.92}$$

$[A + \tilde{A}]$ is clearly symmetric, whereas $[A - \tilde{A}]$ is clearly antisymmetric. This
is the matrix analog of Eq. 3.22, Chapter 3, for tensors.
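The decomposition of Eq. 4.92 can be carried out explicitly for any square array. In the sketch below the function name and the sample matrix are arbitrary.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]


def split_symmetric_antisymmetric(a):
    """Return the two terms of Eq. 4.92: (A + A~)/2 and (A - A~)/2."""
    at = transpose(a)
    n = len(a)
    sym = [[0.5 * (a[i][j] + at[i][j]) for j in range(n)] for i in range(n)]
    anti = [[0.5 * (a[i][j] - at[i][j]) for j in range(n)] for i in range(n)]
    return sym, anti


A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 10.0]]
S, K = split_symmetric_antisymmetric(A)

print(S)  # equal to its own transpose
print(K)  # equal to minus its transpose, with zeros on the diagonal
```

Adding the two parts element by element returns the original matrix, and the antisymmetric part has a vanishing diagonal, as stated after Eq. 4.91.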

Similarity Transformation 

So far we have interpreted the orthogonal matrix as rotating the coordinate
system. This changes the components of a fixed vector (not rotating with the
coordinates) (Fig. 1.7, Chapter 1). However, Eq. 4.61 may be interpreted equally
well as a rotation of the vector in the opposite direction (Fig. 4.4).

These two possibilities, (1) rotating the vector keeping the basis fixed and
(2) rotating the basis (in the opposite sense) keeping the vector fixed, have a
direct analogy in quantum theory. Rotation (a time transformation) of the
state vector gives the Schrödinger picture. Rotation of the basis keeping the
state vector fixed yields the Heisenberg picture.

FIG. 4.4 Fixed coordinates, rotated vector

Suppose we interpret matrix A as rotating a vector r into the position shown 
by iv 

r t = Ar. (4.93) 

Now let us rotate the coordinates by applying matrix B, which rotates (x,y, z) 
into (*',/, O, 


Brj = BAr 

= BA(B~‘B)r (4.94) 

= (BAB -1 ) Br. 

Br x is just r, in the new coordinate system with a similar interpretation holding 
for Br. Hence in this new system (Br) is rotated into position ( B r , ) by the 
matrix BAB" 1 . 

Br t =(BAB" 1 )Br 

I i 1 

rj = A' r' 

In the new system the coordinates having been rotated by matrix B, A has the 
form A', in which 
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A' = BAB -1 . 

(4-95) 

A' operates in the x\ y\ z' space as A operates in the x, y , z space. 

The transformation defined by Eq. 4.95 with B any matrix, not necessarily
orthogonal, is known as a similarity transformation. In component form Eq.
4.95 becomes

$$a'_{ij} = \sum_{k,l} b_{ik}\, a_{kl}\, (\mathsf{B}^{-1})_{lj}.$$ (4.96)

Now if B is orthogonal,

$$(\mathsf{B}^{-1})_{lj} = (\tilde{\mathsf{B}})_{lj} = b_{jl},$$ (4.97)

and we have

$$a'_{ij} = \sum_{k,l} b_{ik}\, b_{jl}\, a_{kl}.$$ (4.98)


It may be helpful to think of A again as an operator, possibly as rotating 
coordinate axes, relating current density and electric fields in an anisotropic 
crystal (Section 3.1) or angular momentum and angular velocity of a rotating 
solid (Section 4.6). Matrix A is the representation in a given coordinate system,
or basis. But there are directions associated with A (crystal axes, symmetry
axes in the rotating solid, and so on), so that the representation A depends
on the basis. The similarity transformation shows just how the representation
changes with a change of basis.
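The chain of equalities in Eq. 4.94 can be checked numerically. The following is a minimal sketch, not part of the original text: the sample matrices `A` and `B`, the vector `r`, and all helper names are assumptions chosen only for illustration. It confirms that $\mathsf{B}\mathbf{r}_1 = \mathsf{A}'(\mathsf{B}\mathbf{r})$ and that the trace and determinant of A survive the change of basis.

```python
import math

def matmul(X, Y):
    """Matrix product of X and Y, each given as a list of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matvec(X, v):
    """Matrix X acting on the column vector v."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in X]

def inverse_2x2(M):
    """Inverse of a nonsingular 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def trace_2x2(M):
    return M[0][0] + M[1][1]

def det_2x2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

phi = 0.3                                   # hypothetical rotation angle
A = [[math.cos(phi), math.sin(phi)],
     [-math.sin(phi), math.cos(phi)]]       # operator in the original basis
B = [[2.0, 1.0], [0.5, 3.0]]                # any nonsingular B, not orthogonal
A_prime = matmul(matmul(B, A), inverse_2x2(B))   # Eq. 4.95

r = [1.0, 2.0]
r1 = matvec(A, r)                           # Eq. 4.93
lhs = matvec(B, r1)                         # B r_1
rhs = matvec(A_prime, matvec(B, r))         # A' (B r), Eq. 4.94

print(lhs, rhs)
print(trace_2x2(A), trace_2x2(A_prime))
print(det_2x2(A), det_2x2(A_prime))
```

Because B here is deliberately not orthogonal, the agreement illustrates the general similarity transformation rather than the special case of Eq. 4.98.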


Relation to Tensors 

Comparing Eq. 4.98 with the equations of Section 3.1, we see that it is the 
definition of a tensor of second rank. Hence a matrix that transforms by an 
orthogonal similarity transformation is, by definition, a tensor. Clearly, then, 
any orthogonal matrix A, interpreted as rotating a vector (Eq. 4.93), may be 
called a tensor. If, however, we consider the orthogonal matrix as a collection 
of fixed direction cosines, giving the new orientation of a coordinate system, 
there is no tensor transformation involved. 

The symmetry and antisymmetry properties defined earlier are preserved 
under orthogonal similarity transformations. Let A be a symmetric matrix, 


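$$\mathsf{A} = \tilde{\mathsf{A}},$$

and

$$\mathsf{A}' = \mathsf{B}\mathsf{A}\mathsf{B}^{-1}.$$ (4.99)

Now

$$\tilde{\mathsf{A}}' = \tilde{\mathsf{B}}^{-1}\tilde{\mathsf{A}}\tilde{\mathsf{B}} = \mathsf{B}\tilde{\mathsf{A}}\mathsf{B}^{-1},$$ (4.100)

since B is orthogonal. But $\tilde{\mathsf{A}} = \mathsf{A}$. Therefore

$$\tilde{\mathsf{A}}' = \mathsf{B}\mathsf{A}\mathsf{B}^{-1} = \mathsf{A}',$$ (4.101)

showing that the property of symmetry is invariant under an orthogonal
similarity transformation. In general, symmetry is not preserved under a
nonorthogonal similarity transformation.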
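The contrast between orthogonal and nonorthogonal similarity transformations can be seen in a small numerical case. This is an illustrative sketch only; the matrices `A`, `B_orth`, and `B_gen` and the helper names are assumptions, not taken from the text.

```python
import math

def matmul(X, Y):
    """Matrix product of X and Y, each given as a list of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(M):
    return [list(col) for col in zip(*M)]

def is_symmetric(M, eps=1e-12):
    n = len(M)
    return all(abs(M[i][j] - M[j][i]) < eps
               for i in range(n) for j in range(i + 1, n))

A = [[1.0, 2.0], [2.0, 5.0]]                 # symmetric sample matrix

theta = 0.7
B_orth = [[math.cos(theta), math.sin(theta)],
          [-math.sin(theta), math.cos(theta)]]
# For orthogonal B the inverse is the transpose, so A' = B A B~.
A_orth = matmul(matmul(B_orth, A), transpose(B_orth))

B_gen = [[1.0, 1.0], [0.0, 1.0]]             # nonorthogonal shear
B_gen_inv = [[1.0, -1.0], [0.0, 1.0]]        # its inverse, written out by hand
A_gen = matmul(matmul(B_gen, A), B_gen_inv)

print(is_symmetric(A_orth))   # symmetry survives the orthogonal transformation
print(A_gen)                  # the sheared form is no longer symmetric
```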
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EXERCISES 

Note. Assume all matrix elements are real. 

4.3.1 Show that the product of two orthogonal matrices is orthogonal. 

Note. This is a key step in showing that all n x n orthogonal matrices form a 
group (Section 4.10). 

4.3.2 If A is orthogonal, show that its determinant has unit magnitude. 

4.3.3 If A is orthogonal and det A = +1, show that $a_{ij} = C_{ij}$, where $C_{ij}$ is the cofactor
of $a_{ij}$. This yields the identities of Eq. 1.41 used in Section 1.4 to show that a
cross product of vectors (in three-space) is itself a vector.

Hint. Note Exercise 4.2.32. 

4.3.4 Another set of Euler rotations in common use is

1. a rotation about the $x_3$-axis through an angle $\varphi$, counterclockwise,

2. a rotation about the $x_1'$-axis through an angle $\theta$, counterclockwise, and

3. a rotation about the $x_3''$-axis through an angle $\psi$, counterclockwise.

If

$$\alpha = \varphi - \pi/2 \qquad \varphi = \alpha + \pi/2$$
$$\beta = \theta \qquad \theta = \beta$$
$$\gamma = \psi + \pi/2 \qquad \psi = \gamma - \pi/2,$$

show that the final systems are identical.

4.3.5 Suppose the Earth is moved (rotated) so that the north pole goes to 30° north, 
20° west (original latitude and longitude system) and the 10° west meridian 
points due south. 

(a) What are the Euler angles describing this rotation? 

(b) Find the corresponding direction cosines. 

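ANS. (b)
$$\mathsf{A} = \begin{pmatrix} 0.9551 & -0.2552 & -0.1504 \\ 0.0052 & 0.5221 & -0.8529 \\ 0.2962 & 0.8138 & 0.5000 \end{pmatrix}$$

4.3.6 Verify that the Euler angle rotation matrix, Eq. 4.87, is invariant under the
transformation

$$\alpha \to \alpha + \pi, \qquad \beta \to -\beta, \qquad \gamma \to \gamma - \pi.$$

4.3.7 Show that the Euler angle rotation matrix $\mathsf{A}(\alpha, \beta, \gamma)$ satisfies the following
relations:

(a) $\mathsf{A}^{-1}(\alpha, \beta, \gamma) = \tilde{\mathsf{A}}(\alpha, \beta, \gamma)$

(b) $\mathsf{A}^{-1}(\alpha, \beta, \gamma) = \mathsf{A}(-\gamma, -\beta, -\alpha)$.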

4.3.8 Show that the trace of the product of a symmetric and an antisymmetric matrix 
is zero. 

4.3.9 Show that the trace of a matrix remains invariant under similarity transforma- 
tions. 

4.3.10 Show that the determinant of a matrix remains invariant under similarity
transformations.

Note. These two exercises (4.3.9 and 4.3.10) show that the trace and the
determinant are independent of the basis. They are characteristics of the matrix
(operator) itself.

4.3.11 Show that the property of antisymmetry is invariant under orthogonal similarity
transformations.

4.3.12 A is 2 x 2 and orthogonal. Find the most general form of A.
Compare with two-dimensional rotation.

4.3.13 $|x\rangle$ and $|y\rangle$ are column vectors. Under an orthogonal transformation S, $|x'\rangle = \mathsf{S}|x\rangle$,
$|y'\rangle = \mathsf{S}|y\rangle$. Show that the scalar product $\langle x|y\rangle$ is invariant under this
orthogonal transformation.

Note. This is equivalent to the invariance of the dot product of two vectors, 
Section 1.3. 

4.3.14 Show that the sum of the squares of the elements of a matrix remains invariant
under orthogonal similarity transformations.

Note. In Exercise 3.7.11 $c^2B^2 - E^2$ may be obtained as the sum of the squares
of the components of the matrix (tensor)

4.3.15 As a generalization of Exercise 4.3.14, show that

$$\sum_{j,k} S'_{jk} T'_{jk} = \sum_{l,m} S_{lm} T_{lm},$$

where the primed and unprimed elements are related by an orthogonal similarity
transformation. This result is useful in deriving invariants in electromagnetic
theory (compare Section 3.7).

Note. This product $M_{jk} = \sum S_{jk} T_{jk}$ is sometimes called a Hadamard product.
In the framework of tensor analysis, Chapter 3, this exercise becomes a double
contraction of two second-rank tensors and therefore is clearly a scalar
(invariant)!

4.3.16 A rotation $\varphi_1 + \varphi_2$ about the z-axis is carried out as two successive rotations
$\varphi_1$ and $\varphi_2$, each about the z-axis. Use the matrix representation of the rotations
to derive the trigonometric identities:

$$\cos(\varphi_1 + \varphi_2) = \cos\varphi_1 \cos\varphi_2 - \sin\varphi_1 \sin\varphi_2$$
$$\sin(\varphi_1 + \varphi_2) = \sin\varphi_1 \cos\varphi_2 + \cos\varphi_1 \sin\varphi_2.$$

4.3.17 A column vector V has components $V_1$ and $V_2$ in an initial (unprimed) system.
Calculate $V_1'$ and $V_2'$ for a

(a) rotation of the coordinates through an angle of $\theta$ counterclockwise,

(b) rotation of the vector through an angle of $\theta$ clockwise.

The results for parts (a) and (b) should be identical.

4.3.18 Write a subroutine that will test whether a real N x N matrix is symmetric.
Symmetry may be defined as

$$0 \le |a_{ij} - a_{ji}| < \varepsilon,$$

where $\varepsilon$ is some small tolerance (which allows for truncation error, and so on
in the machine).
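One possible shape for the symmetry test of Exercise 4.3.18 is sketched below. The function name `is_symmetric` and the default tolerance are assumptions made for illustration; the tolerance plays the role of $\varepsilon$ above and should be chosen to suit the arithmetic actually in use.

```python
def is_symmetric(a, eps=1e-12):
    """Return True if the real square matrix a (a list of rows) satisfies
    |a[i][j] - a[j][i]| < eps for every pair of off-diagonal elements."""
    n = len(a)
    if any(len(row) != n for row in a):
        raise ValueError("matrix must be square")
    # Only the elements above the diagonal need to be compared with their mirrors.
    return all(abs(a[i][j] - a[j][i]) < eps
               for i in range(n) for j in range(i + 1, n))

print(is_symmetric([[1.0, 2.0], [2.0, 3.0]]))
print(is_symmetric([[1.0, 2.0], [2.1, 3.0]]))
```

An absolute tolerance is used here because that is how the exercise states the criterion; for matrices with very large or very small elements a relative tolerance may be the better choice.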
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4.4 OBLIQUE COORDINATES 


Throughout this book so far — vector analysis, coordinate systems, tensor 
analysis, and now matrices — we have always taken our coordinates to be 
orthogonal. But sometimes the demands of a physical system force the use of 
a nonorthogonal or oblique system of coordinates. In describing the physical 
properties of a crystal, for example, we might find it more convenient to use 
the coordinate system defined by the axes of this crystal — and these axes are 
often oblique. 

Consider a coordinate system in which the noncoplanar unit vectors a, b, 
and c are not orthogonal. (When we describe a crystal a, b, and c might not 
have unit magnitude either. The interatomic spacings would be more appro- 
priate lengths.) Then an arbitrary vector may be written 

$$\mathbf{V} = \mathbf{i}V_x + \mathbf{j}V_y + \mathbf{k}V_z = \mathbf{a}v_a + \mathbf{b}v_b + \mathbf{c}v_c = \mathbf{v}.$$ (4.102)

V will denote the vector expressed in the usual rectangular cartesian system,
whereas v is the same vector expressed in the oblique coordinate system.
Equivalently, we can say that $(V_x, V_y, V_z)$ is the representation in the usual
cartesian basis, whereas $(v_a, v_b, v_c)$ is the representation of the same vector in the
nonorthogonal basis.



The special case (really two-dimensional) of j, k, b, c, and V all in the x = 0 
plane is shown in Fig. 4.5. Note carefully that the components v b and v c are 
found by projecting the tip of V parallel to c for v b and parallel to b for v c . The 
general procedure for obtaining one component would be to pass a plane 
through the tip of V parallel to the plane defined by the other two unit vectors. 
With the components defined this way, the sum of the components is just V 
by the triangle or parallelogram laws of vector addition, Section 1.1. 

We proceed from Eq. 4.102 exactly as in Section 4.3 with a instead of i', b
instead of j', and c in place of k'. From

$$\mathbf{a} = \mathbf{i}a_x + \mathbf{j}a_y + \mathbf{k}a_z$$
$$\mathbf{b} = \mathbf{i}b_x + \mathbf{j}b_y + \mathbf{k}b_z$$ (4.103)
$$\mathbf{c} = \mathbf{i}c_x + \mathbf{j}c_y + \mathbf{k}c_z,$$

equating cartesian components, we obtain

$$V_x = a_x v_a + b_x v_b + c_x v_c$$
$$V_y = a_y v_a + b_y v_b + c_y v_c$$ (4.104)
$$V_z = a_z v_a + b_z v_b + c_z v_c.$$

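In matrix form the vector V described by an orthogonal basis is related to
its description in the oblique basis by

$$\mathbf{V} = \mathsf{P}\mathbf{v},$$ (4.105)

where

$$\mathsf{P} = \begin{pmatrix} a_x & b_x & c_x \\ a_y & b_y & c_y \\ a_z & b_z & c_z \end{pmatrix}.$$ (4.106)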


The transformation matrix P is not orthogonal, since the column vectors 
forming it, a, b, and c, are not orthogonal. 

Since 


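$$\mathbf{v} = \mathsf{P}^{-1}\mathbf{V},$$ (4.107)

we seek $\mathsf{P}^{-1}$. The solution is actually developed in Section 1.5. The reciprocal
lattice vectors

$$\mathbf{a}' = \frac{\mathbf{b} \times \mathbf{c}}{\mathbf{a} \times \mathbf{b} \cdot \mathbf{c}}, \qquad \mathbf{b}' = \frac{\mathbf{c} \times \mathbf{a}}{\mathbf{a} \times \mathbf{b} \cdot \mathbf{c}}, \qquad \mathbf{c}' = \frac{\mathbf{a} \times \mathbf{b}}{\mathbf{a} \times \mathbf{b} \cdot \mathbf{c}},$$ (4.108)

taken as row vectors, form a matrix Q,

$$\mathsf{Q} = \begin{pmatrix} a'_x & a'_y & a'_z \\ b'_x & b'_y & b'_z \\ c'_x & c'_y & c'_z \end{pmatrix}.$$ (4.109)

It should be emphasized that a', b', and c' are not orthogonal. Also, they are
not of unit length, and if a, b, and c have dimensions, then a', b', and c' have
reciprocal dimensions. If a is a length, a' could be a wave number. From the
properties developed in Section 1.5

$$\mathsf{P}\mathsf{Q} = \mathsf{Q}\mathsf{P} = 1$$ (4.110)

or

$$\mathsf{Q} = \mathsf{P}^{-1}, \qquad \mathsf{P} = \mathsf{Q}^{-1}.$$ (4.111)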


Exercise 4.4.1 outlines a slightly different, but equivalent, derivation of Q. 



From Eqs. 4.107 and 4.111

$$\mathbf{v} = \mathsf{Q}\mathbf{V}.$$ (4.112)

Taking the transpose of Eqs. 4.105 and 4.112, we have

$$\langle V| = \langle v|\tilde{\mathsf{P}}, \qquad \langle v| = \langle V|\tilde{\mathsf{Q}},$$ (4.113)

$\langle\;|$ denoting a row vector, as in Section 4.2.

V may be resolved in the a', b', c'-space (the reciprocal lattice) exactly as in
the a, b, c-space. From the primed analog of Eqs. 4.102 to 4.104

$$\mathbf{V} = \tilde{\mathsf{Q}}\mathbf{v}', \qquad \mathbf{v}' = \tilde{\mathsf{P}}\mathbf{V},$$ (4.114)

and

$$\langle V| = \langle v'|\mathsf{Q}, \qquad \langle v'| = \langle V|\mathsf{P}.$$ (4.115)

The scalar product of two vectors U and V becomes

$$\langle U|V\rangle = \langle U|\mathsf{P}\mathsf{Q}|V\rangle = \langle u'|v\rangle$$ (4.116)

from Eqs. 4.112 and 4.115. $|\;\rangle$ denotes a column vector. The square of a
vector in oblique coordinates is not the sum of the squares of the components,
but rather, the sum of the products of an oblique component and the
corresponding reciprocal lattice component.

If U and V in Eq. 4.116 are the differential length $d\mathbf{R} = (dx, dy, dz)$, then

$$ds^2 = \langle dR|dR\rangle = \langle dr|\tilde{\mathsf{P}}\mathsf{P}|dr\rangle,$$ (4.117)

using Eqs. 4.105 and 4.113. $ds^2$ is the square of the distance element; $d\mathbf{r}$ is $d\mathbf{R}$
but resolved in the oblique coordinates. Reference to Eq. 2.4 identifies $\tilde{\mathsf{P}}\mathsf{P}$ as
the metric of our oblique coordinates. The metric of the reciprocal lattice is
$\mathsf{Q}\tilde{\mathsf{Q}}$.

Further development of vector analysis, particularly of a vector calculus in 
oblique coordinates, is probably best considered a branch of noncartesian 
tensor analysis, Sections 3.8 and 3.9. 

$\mathbf{v} = (v_a, v_b, v_c)$ is a contravariant vector in the language of Section 3.1. The
corresponding covariant components are $(v'_a, v'_b, v'_c)$ in the reciprocal lattice.
From Eqs. 4.105, 4.112, and 4.114

$$|v'\rangle = \tilde{\mathsf{P}}\mathsf{P}|v\rangle \quad \text{and} \quad |v\rangle = \mathsf{Q}\tilde{\mathsf{Q}}|v'\rangle.$$ (4.118)

The metric $\tilde{\mathsf{P}}\mathsf{P}$ transforms the contravariant vector into covariant form. Its
inverse, $\mathsf{Q}\tilde{\mathsf{Q}}$, transforms the covariant vector into contravariant form. In
contravariant-covariant tensor notation (Section 3.1) the elements of $\tilde{\mathsf{P}}\mathsf{P}$ are
$g_{ij}$, whereas the elements of $\mathsf{Q}\tilde{\mathsf{Q}}$ are $g^{ij}$. We have

$$(ds)^2 = g_{ij}\,dx^i\,dx^j = g^{ij}\,dx_i\,dx_j$$
$$g_{ij}\,g^{ik} = \delta_j^{\;k},$$
$$v_i \text{ (covariant)} = g_{ij}\,v^j$$
$$v^i \text{ (contravariant)} = g^{ij}\,v_j.$$

The reader should note that the distinction between covariant and
contravariant forms vanishes when the coordinates are orthogonal (cartesian).
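The relations among P, Q, the metric, and the two kinds of components can be traced through one concrete case. The sketch below is only an illustration: the basis vectors `a`, `b`, `c`, the vector `V`, and every helper name are assumptions, not values from the text. It builds Q from the reciprocal lattice vectors of Eq. 4.108 and checks Eqs. 4.110, 4.116, and 4.118.

```python
def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matvec(X, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in X]

# Hypothetical oblique basis (noncoplanar, not orthogonal, not unit length).
a, b, c = [1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 1.0, 2.0]
volume = dot(cross(a, b), c)                   # a x b . c

# Reciprocal lattice vectors, Eq. 4.108.
a_r = [x / volume for x in cross(b, c)]
b_r = [x / volume for x in cross(c, a)]
c_r = [x / volume for x in cross(a, b)]

P = transpose([a, b, c])                       # columns a, b, c  (Eq. 4.106)
Q = [a_r, b_r, c_r]                            # rows a', b', c'  (Eq. 4.109)

V = [1.0, 3.0, 2.0]                            # cartesian components
v = matvec(Q, V)                               # contravariant, Eq. 4.112
v_cov = matvec(transpose(P), V)                # covariant, Eq. 4.114
metric = matmul(transpose(P), P)               # g_ij, elements a.a, a.b, ...

print(matmul(Q, P))                            # unit matrix, Eq. 4.110
print(v, v_cov)
print(dot(v_cov, v), dot(V, V))                # equal, Eq. 4.116
print(matvec(metric, v))                       # reproduces v_cov, Eq. 4.118
```

For this choice the contravariant components are (-1, 2, 1), and the sum -a + 2b + c does rebuild V = (1, 3, 2), which is the parallelogram construction described for Fig. 4.5.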

EXERCISES 

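4.4.1 From the result of Exercise 4.2.32, $q_{ij} = P_{ji}/|\mathsf{P}|$, derive the relation

$$\mathbf{a}' = \frac{\mathbf{b} \times \mathbf{c}}{\mathbf{a} \times \mathbf{b} \cdot \mathbf{c}}.$$

4.4.2 The vectors defining a particular system of oblique coordinates are

$$\mathbf{a} = \mathbf{i}, \quad \mathbf{b} = \mathbf{j}, \quad \text{and} \quad \mathbf{c} = (\mathbf{j} + \mathbf{k})/\sqrt{2}.$$

(a) Find P, Q, and metric $\tilde{\mathsf{P}}\mathsf{P}$.

(b) If $\mathbf{V} = \mathbf{i} + 3\mathbf{j} + 2\mathbf{k}$, find v and v'. Verify that

$$\langle v'|v\rangle = V^2.$$

4.4.3 Show that

(a) $v'_a = \mathbf{a} \cdot \mathbf{V}$

(b) $v_a = \mathbf{a}' \cdot \mathbf{V}$.

Note that the lattice defining vectors a, a', and so on need not have unit magnitude.

4.4.4 One vector with cartesian components $V_i$ and oblique (contravariant) components
$v_i$ and a second with cartesian components $U_i$ and reciprocal lattice (covariant)
components $u_i$ are transformed by a rotation of the coordinate systems described
by the (orthogonal) matrix S. By definition of a vector

$$|V'\rangle = \mathsf{S}|V\rangle \quad \text{and} \quad |U'\rangle = \mathsf{S}|U\rangle.$$

(a) Show that

$$|v'\rangle \text{ (contravariant)} = \mathsf{Q}\mathsf{S}\mathsf{P}|v\rangle$$

$$|u'\rangle \text{ (covariant)} = \tilde{\mathsf{P}}\mathsf{S}\tilde{\mathsf{Q}}|u\rangle.$$

(b) Show that $\langle u'|v'\rangle$ is an invariant, independent of S.

4.4.5 Show that the metric for contravariant vectors, $(g_{ij}) = \tilde{\mathsf{P}}\mathsf{P}$, is given by

$$\begin{pmatrix} \mathbf{a}\cdot\mathbf{a} & \mathbf{a}\cdot\mathbf{b} & \mathbf{a}\cdot\mathbf{c} \\ \mathbf{b}\cdot\mathbf{a} & \mathbf{b}\cdot\mathbf{b} & \mathbf{b}\cdot\mathbf{c} \\ \mathbf{c}\cdot\mathbf{a} & \mathbf{c}\cdot\mathbf{b} & \mathbf{c}\cdot\mathbf{c} \end{pmatrix}.$$

For oblique coordinates all these dot products and therefore all the $g_{ij}$'s are
constants.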


4.5 HERMITIAN MATRICES, UNITARY MATRICES 

Definitions 

Thus far it has generally been assumed that our linear vector space is a real
space and that the matrix elements (the representations of the linear operators)
are real. For many calculations in classical physics real matrix elements will
suffice. However, in quantum mechanics complex variables are unavoidable
because of the form of the basic commutation relations (or the form of the
time-dependent Schrödinger equation). With this in mind, we generalize to
the case of complex matrix elements. To handle these elements, let us define,
or label, some new properties.

1. Complex conjugate, $\mathsf{A}^*$, formed by taking the
complex conjugate ($i \to -i$) of each element, where
$i = \sqrt{-1}$.

2. Adjoint, $\mathsf{A}^\dagger$, formed by transposing $\mathsf{A}^*$,

$$\mathsf{A}^\dagger = \widetilde{\mathsf{A}^*} = \tilde{\mathsf{A}}^*.$$ (4.119)

3. Hermitian matrix. The matrix A is labeled Hermitian
(or self-adjoint) if

$$\mathsf{A} = \mathsf{A}^\dagger.$$ (4.120)

In quantum mechanics (or matrix mechanics)
matrices are usually constructed to be Hermitian.

4. Unitary matrix. Matrix U is labeled unitary if

$$\mathsf{U}^\dagger = \mathsf{U}^{-1},$$ (4.121)

which represents a generalization of the concept of
orthogonal matrix (compare Eq. 4.73).

If the matrix elements are complex, the physicist is almost always concerned
with adjoint matrices, Hermitian matrices, and unitary matrices. Unitary
matrices are especially important in quantum mechanics because they leave
the length of a (complex) vector unchanged, analogous to the operation of an
orthogonal matrix on a real vector. It is for this reason that the S matrix of
scattering theory is a unitary matrix. One important exception to this interest
in unitary matrices is the group of Lorentz matrices, Sections 3.7 and 4.13.
Using Minkowski space, we see that these matrices are orthogonal, not unitary.

If the transforming matrix in a similarity transformation is unitary, the
transformation is referred to as a unitary transformation,

$$\mathsf{A}' = \mathsf{U}\mathsf{A}\mathsf{U}^\dagger.$$ (4.122)

Just as the product of two orthogonal matrices is found to be orthogonal
(Exercise 4.3.1), so we can show that the product of two unitary matrices is
unitary. Let $\mathsf{U}_1$ and $\mathsf{U}_2$ be unitary. Then

$$1 = (\mathsf{U}_1\mathsf{U}_2)(\mathsf{U}_1\mathsf{U}_2)^{-1} = \mathsf{U}_1\mathsf{U}_2\mathsf{U}_2^{-1}\mathsf{U}_1^{-1} = \mathsf{U}_1\mathsf{U}_2\mathsf{U}_2^\dagger\mathsf{U}_1^\dagger,$$ (4.123)

using the unitary property. Since the operation of adjoint is the same as
transpose (except for the complex conjugate),

$$(\mathsf{U}_1\mathsf{U}_2)^\dagger = \mathsf{U}_2^\dagger\mathsf{U}_1^\dagger$$ (4.124)

(Exercise 4.5.3). Substituting into Eq. 4.123, we have

$$1 = (\mathsf{U}_1\mathsf{U}_2)(\mathsf{U}_1\mathsf{U}_2)^\dagger.$$ (4.125)

Multiplying from the left by $(\mathsf{U}_1\mathsf{U}_2)^{-1}$, we obtain

$$(\mathsf{U}_1\mathsf{U}_2)^{-1} = (\mathsf{U}_1\mathsf{U}_2)^\dagger,$$ (4.126)

which shows that the product of two unitary matrices is itself unitary. This is 
one of the steps in demonstrating that the n x n unitary matrices form a group 
(Section 4.10). Other properties and applications of these concepts are included 
in the exercises at the end of this section. 
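The definitions of adjoint, Hermitian, and unitary matrices, and the closure of unitary matrices under multiplication, can be tried out on small complex matrices. What follows is a hedged sketch: the helper names, the tolerance, and the sample matrices `U1`, `U2`, and `H` are all assumptions introduced for illustration.

```python
import math

def matmul(X, Y):
    """Matrix product of X and Y, each given as a list of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def adjoint(M):
    """Transpose of the complex conjugate; works for M x N matrices."""
    return [[M[i][j].conjugate() for i in range(len(M))]
            for j in range(len(M[0]))]

def is_hermitian(A, eps=1e-12):
    n = len(A)
    return all(abs(A[i][j] - A[j][i].conjugate()) < eps
               for i in range(n) for j in range(n))

def is_unitary(U, eps=1e-12):
    """Test U^dagger U = 1 element by element, within the tolerance eps."""
    W = matmul(adjoint(U), U)
    n = len(U)
    return all(abs(W[i][j] - (1 if i == j else 0)) < eps
               for i in range(n) for j in range(n))

def norm_squared(r):
    return sum(abs(x) ** 2 for x in r)

s = 1 / math.sqrt(2)
U1 = [[s, s], [s, -s]]                  # real orthogonal, hence unitary
U2 = [[s, 1j * s], [1j * s, s]]         # complex unitary
product = matmul(U1, U2)

r = [1 + 2j, 3 - 1j]
r_new = [sum(row[k] * r[k] for k in range(2)) for row in product]

H = [[2, 1 - 1j], [1 + 1j, 3]]          # Hermitian sample matrix

print(is_unitary(U1), is_unitary(U2), is_unitary(product))
print(norm_squared(r), norm_squared(r_new))
print(is_hermitian(H))
```

The same `adjoint` helper also forms $\mathsf{A}^\dagger\mathsf{A}$ for a rectangular complex matrix when combined with `matmul`.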

Pauli Matrices 

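Four by four complex matrices have been used extensively in relativistic
theories of the electron. A convenient starting point for developing the 4 x 4
matrices is the set of three 2 x 2 Pauli matrices

$$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$$ (4.127)

These were introduced by W. Pauli to describe a particle of spin 1/2 (nonrelativistic
theory). It can readily be shown that (compare Exercise 4.2.13) the Pauli $\sigma$'s
satisfy

$$\sigma_i\sigma_j + \sigma_j\sigma_i = 2\delta_{ij}\,1, \quad \text{anticommutation}$$ (4.128)

$$\sigma_i\sigma_j = i\sigma_k, \quad \text{cyclic permutation of indices}$$ (4.129)

$$(\sigma_i)^2 = 1.$$ (4.130)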
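The three Pauli relations can be confirmed by direct multiplication. This is a small sketch with helper names that are assumptions; the matrices are the standard Pauli matrices, and because every element is 0, ±1, or ±i the comparisons are exact.

```python
def matmul(X, Y):
    """Matrix product of X and Y, each given as a list of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    return [[c * x for x in row] for row in X]

unit = [[1, 0], [0, 1]]
sigma = {1: [[0, 1], [1, 0]],
         2: [[0, -1j], [1j, 0]],
         3: [[1, 0], [0, -1]]}

# Anticommutation: sigma_i sigma_j + sigma_j sigma_i = 2 delta_ij 1.
anticommutation_ok = all(
    add(matmul(sigma[i], sigma[j]), matmul(sigma[j], sigma[i]))
    == scale(2 if i == j else 0, unit)
    for i in sigma for j in sigma)

# Cyclic products: sigma_1 sigma_2 = i sigma_3, and cyclic permutations.
cyclic_ok = all(
    matmul(sigma[i], sigma[j]) == scale(1j, sigma[k])
    for i, j, k in [(1, 2, 3), (2, 3, 1), (3, 1, 2)])

# Each Pauli matrix squares to the unit matrix.
squares_ok = all(matmul(sigma[i], sigma[i]) == unit for i in sigma)

print(anticommutation_ok, cyclic_ok, squares_ok)
```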


Dirac Matrices 

In 1927 P. A. M. Dirac extended this formalism. Dirac required a set of
four anticommuting matrices. The three Pauli matrices plus the unit matrix
form a complete set; that is, any constant 2 x 2 matrix M may be written

$$\mathsf{M} = c_0 1 + c_1\sigma_1 + c_2\sigma_2 + c_3\sigma_3,$$ (4.131)

where $c_0$, $c_1$, $c_2$, and $c_3$ are constants. Hence the Pauli 2 x 2 matrices were
inadequate; no fourth anticommuting matrix exists. We can show that 3 x 3
matrices likewise cannot furnish an anticommuting set of four matrices
(Exercise 4.7.8).

Turning to 4 x 4 matrices, we can build up a complete set as direct products¹
of the Pauli matrices and the unit matrix. Let

$$\sigma_{j,\,\text{Dirac}} = 1 \otimes \sigma_{j,\,\text{Pauli}}$$ (4.132)

$$\rho_{j,\,\text{Dirac}} = \sigma_{j,\,\text{Pauli}} \otimes 1.$$ (4.133)

For example,

$$\sigma_1 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \qquad \rho_1 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}.$$

1 The direct product $\mathsf{A} \otimes \mathsf{B}$ is defined in Section 4.2.

We can show that these 4 x 4 matrices satisfy the relations

$$\sigma_i\sigma_j + \sigma_j\sigma_i = 2\delta_{ij}\,1, \qquad \rho_i\rho_j + \rho_j\rho_i = 2\delta_{ij}\,1, \quad \text{anticommutation},$$ (4.134)

$$\sigma_i\rho_j - \rho_j\sigma_i \equiv [\sigma_i, \rho_j] = 0, \quad \text{commutation},$$ (4.135)

and

$$\sigma_i\sigma_j = i\sigma_k, \qquad \rho_i\rho_j = i\rho_k, \quad \text{cyclic permutation of indices}.$$ (4.136)

It is now possible to set up a matrix multiplication table (Table 4.1). 

Dirac originally chose to use the set of four matrices labeled $\alpha_1$, $\alpha_2$, $\alpha_3$, and
$\alpha_4$, where $\alpha_i = \rho_1\sigma_i$ and $\alpha_4 = \rho_3$. Today the set labeled $\gamma_i$, $i = 1, 2, 3, 4, 5$, is
in more common use.

These 4 x 4 Dirac matrices may be referred to as $\mathsf{E}_{ij}$, in which²

$$\mathsf{E}_{ij} = \rho_i\sigma_j.$$

With the understanding that $\rho_0 = \sigma_0 = 1$, the unit matrix, we let the indices $i$
and $j$ range from 0 to 3. These 16 matrices $\mathsf{E}_{ij}$ have a number of interesting
properties:

1. $\det \mathsf{E}_{ij} = +1$.

2. $\mathsf{E}_{ij}^2 = 1$.

3. $\mathsf{E}_{ij} = \mathsf{E}_{ij}^\dagger$; all are Hermitian and then, by property 2,
unitary.

4. Trace $(\mathsf{E}_{ij}) = 0$ except for $\mathsf{E}_{00} = 1$, in which case
trace $(\mathsf{E}_{00}) = 4$. This property is exploited in Exercise
4.5.23 as the matrix analog of orthogonality.

5. The 16 $\mathsf{E}_{ij}$ matrices almost form a mathematical
group.³ Any two of them multiplied together yield
a member of the set within a factor of $-1$ or $\pm i$.

6. The 16 $\mathsf{E}_{ij}$ are linearly independent. No one can be
written as a linear sum of the other 15.

7. The 16 $\mathsf{E}_{ij}$ form a complete set. Any 4 x 4 matrix
(with constant elements) may be written as a linear
combination of these 16,

$$\mathsf{A} = \sum_{i,j=0}^{3} c_{ij}\mathsf{E}_{ij},$$

where the coefficients $c_{ij}$ are constants, real or
complex.

2 $\mathsf{E}_{ij} = \sigma_{i,\,\text{Pauli}} \otimes \sigma_{j,\,\text{Pauli}}$.

3 The $\mathsf{E}_{ij}$ can be modified so that they satisfy the group property exactly, but
then they are no longer Hermitian and unitary.
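Several of the listed properties can be checked by building the 16 matrices as direct products. The sketch below assumes the block convention for the direct product in which block $(i, j)$ of $\mathsf{A} \otimes \mathsf{B}$ is $a_{ij}\mathsf{B}$; the helper names (`kron`, `adjoint`, `trace`) are illustrative, not from the text. It verifies properties 2, 3, and 4, the commutation relation of Eq. 4.135, and the identity $\rho_i\sigma_j = \sigma_{i,\text{Pauli}} \otimes \sigma_{j,\text{Pauli}}$.

```python
def matmul(X, Y):
    """Matrix product of X and Y, each given as a list of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def kron(X, Y):
    """Direct product: block (i, j) of the result is X[i][j] * Y."""
    return [[X[i][j] * Y[k][l]
             for j in range(len(X[0])) for l in range(len(Y[0]))]
            for i in range(len(X)) for k in range(len(Y))]

def adjoint(M):
    return [[M[i][j].conjugate() for i in range(len(M))]
            for j in range(len(M[0]))]

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

unit2 = [[1, 0], [0, 1]]
pauli = [unit2,                      # index 0 is the unit matrix
         [[0, 1], [1, 0]],
         [[0, -1j], [1j, 0]],
         [[1, 0], [0, -1]]]

sigma = [kron(unit2, p) for p in pauli]    # Eq. 4.132
rho = [kron(p, unit2) for p in pauli]      # Eq. 4.133
unit4 = kron(unit2, unit2)
E = [[matmul(rho[i], sigma[j]) for j in range(4)] for i in range(4)]

pairs = [(i, j) for i in range(4) for j in range(4)]
squares_ok = all(matmul(E[i][j], E[i][j]) == unit4 for i, j in pairs)
hermitian_ok = all(adjoint(E[i][j]) == E[i][j] for i, j in pairs)
traces = {(i, j): trace(E[i][j]) for i, j in pairs}
direct_ok = all(E[i][j] == kron(pauli[i], pauli[j]) for i, j in pairs)
commute_ok = all(matmul(sigma[i], rho[j]) == matmul(rho[j], sigma[i])
                 for i in range(1, 4) for j in range(1, 4))

print(squares_ok, hermitian_ok, direct_ok, commute_ok)
print(traces[(0, 0)], [t for key, t in traces.items() if key != (0, 0)])
```

All elements are 0, ±1, or ±i, so exact equality tests are safe here; with general floating-point matrices a tolerance would be needed instead.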


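TABLE 4.1 Dirac Matrices

$$\rho_3,\ \alpha_4 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \qquad \delta_1 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \end{pmatrix},$$

$$\delta_2 = \begin{pmatrix} 0 & -i & 0 & 0 \\ i & 0 & 0 & 0 \\ 0 & 0 & 0 & i \\ 0 & 0 & -i & 0 \end{pmatrix}, \qquad \delta_3 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$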

Anticommuting Sets 

From these 16 Hermitian matrices we can form six anticommuting sets of
5 matrices each. Using the labels shown in Table 4.1, we have the following
sets:

1. $\alpha_1, \alpha_2, \alpha_3, \alpha_4, \alpha_5$
2. $\gamma_1, \gamma_2, \gamma_3, \gamma_4, \gamma_5$
3. $\delta_1, \delta_2, \delta_3, \rho_1, \rho_2$
4. $\alpha_1, \gamma_1, \delta_1, \sigma_2, \sigma_3$
5. $\alpha_2, \gamma_2, \delta_2, \sigma_1, \sigma_3$
6. $\alpha_3, \gamma_3, \delta_3, \sigma_1, \sigma_2$ (4.137)

Each $\mathsf{E}_{ij}$ (exclusive of the unit matrix) appears in two of the preceding sets. In
addition to the set of $\alpha$'s, the set of $\gamma$'s has been used extensively in relativistic
quantum theory.

The largest completely commuting sets of Dirac matrices (including the unit
matrix) have only four matrices.

The discussion of orthogonal matrices in Section 4.3 and unitary matrices
in this section is only a beginning. The further extensions are of vital
concern in modern "elementary" particle physics. With the Pauli and Dirac
matrices, we can develop spinors for describing electrons, protons, and other
spin 1/2 particles. The coordinate system rotations lead to $D^j(\alpha, \beta, \gamma)$, the rotation
group usually represented by matrices in which the elements are functions of
the Euler angles describing the rotation. The special unitary group SU(3)
(composed of 3 x 3 unitary matrices with determinant +1) has been used
with considerable success to describe mesons and baryons. These extensions
are considered further in Sections 4.10 to 4.12.


EXERCISES 

4.5.1 Show that 

$$\det(\mathsf{A}^*) = (\det \mathsf{A})^* = \det(\mathsf{A}^\dagger).$$

4.5.2 Three angular momentum matrices satisfy the basic commutation relation

$$[\mathsf{J}_x, \mathsf{J}_y] = i\mathsf{J}_z$$

(and cyclic permutation of indices). If two of the matrices have real elements,
show that the elements of the third must be pure imaginary.

4.5.3 Show that $(\mathsf{A}\mathsf{B})^\dagger = \mathsf{B}^\dagger\mathsf{A}^\dagger$.

4.5.4 Matrix $\mathsf{C} = \mathsf{S}^\dagger\mathsf{S}$. Show that the trace is positive definite unless S is the null
matrix, in which case trace (C) = 0.

4.5.5 If A and B are Hermitian matrices, show that $(\mathsf{A}\mathsf{B} + \mathsf{B}\mathsf{A})$ and $i(\mathsf{A}\mathsf{B} - \mathsf{B}\mathsf{A})$
are also Hermitian.

4.5.6 Matrix C is not Hermitian. Show that $\mathsf{C} + \mathsf{C}^\dagger$ and $i(\mathsf{C} - \mathsf{C}^\dagger)$ are Hermitian.
This means that a non-Hermitian matrix may be resolved into two Hermitian
parts:

$$\mathsf{C} = \tfrac{1}{2}(\mathsf{C} + \mathsf{C}^\dagger) + \tfrac{1}{2i}\,i(\mathsf{C} - \mathsf{C}^\dagger).$$

4.5.7 A and B are two noncommuting Hermitian matrices:

$$\mathsf{A}\mathsf{B} - \mathsf{B}\mathsf{A} = i\mathsf{C}.$$

Prove that C is Hermitian.

4.5.8 Show that a Hermitian matrix remains Hermitian under unitary similarity
transformations.

4.5.9 Two matrices A and B are each Hermitian. Find a necessary and sufficient
condition for their product AB to be Hermitian.

ANS. $[\mathsf{A}, \mathsf{B}] = 0$.

4.5.10 Show that the reciprocal of a unitary matrix is unitary.

4.5.11 A particular similarity transformation yields

$$\mathsf{A}' = \mathsf{U}\mathsf{A}\mathsf{U}^{-1}$$
$$\mathsf{A}^{\dagger\prime} = \mathsf{U}\mathsf{A}^\dagger\mathsf{U}^{-1}.$$

If the adjoint relationship is preserved ($\mathsf{A}^{\dagger\prime} = \mathsf{A}'^{\dagger}$) and det U = 1, show that U
must be unitary.

4.5.12 Two matrices U and H are related by

$$\mathsf{U} = e^{ia\mathsf{H}}$$

with $a$ real. (The exponential function is defined by a Maclaurin expansion. This
will be done in Section 4.11.)

(a) If H is Hermitian, show that U is unitary.

(b) If U is unitary, show that H is Hermitian. (H is independent of $a$.)

Note. With H the Hamiltonian,

$$\psi(x, t) = \mathsf{U}(x, t)\psi(x, 0) = \exp(-it\mathsf{H}/\hbar)\,\psi(x, 0)$$

is a solution of the time-dependent Schrödinger equation. $\mathsf{U}(x, t) = \exp(-it\mathsf{H}/\hbar)$
is the "evolution operator."

4.5.13 An operator $\mathsf{T}(t + \varepsilon, t)$ describes the change in the wave function from $t$ to
$t + \varepsilon$. For $\varepsilon$ real and small enough so that $\varepsilon^2$ may be neglected

$$\mathsf{T}(t + \varepsilon, t) = 1 - \frac{i}{\hbar}\,\varepsilon\,\mathsf{H}(t).$$

(a) If T is unitary, show that H is Hermitian.

(b) If H is Hermitian, show that T is unitary.

Note. When H(t) is independent of time, this relation may be put in exponential
form (Exercise 4.5.12).

4.5.14 Show that an alternate form

$$\mathsf{T}(t + \varepsilon, t) = \frac{1 - i\varepsilon\mathsf{H}(t)/2\hbar}{1 + i\varepsilon\mathsf{H}(t)/2\hbar}$$

agrees with the T of part (a) of Exercise 4.5.13 neglecting $\varepsilon^2$ and is exactly unitary
(for H Hermitian).

4.5.15 Prove that the direct product of two unitary matrices is unitary.

4.5.16 Denoting the 16 Dirac matrices by $\mathsf{E}_{ij}$ ($\rho_0 = \sigma_0 = 1$), show that

(a) $\mathsf{E}_{ij}^2 = 1$ for all $i$ and $j$,

(b) $\mathsf{E}_{ij} = \mathsf{E}_{ij}^\dagger$ (Hermitian).

Hint. Use the known properties of $\rho_i$ and $\sigma_j$.

4.5.17 Verify Eqs. 4.134 to 4.136 for the 4 x 4 $\sigma$ and $\rho$ matrices.

4.5.18 Using Eqs. 4.135 and 4.136, show that each of the six sets of Dirac matrices
listed in Eq. 4.137 is actually an anticommuting set.

4.5.19 Using Eqs. 4.135 and 4.136, show that

(a) $\alpha_1\alpha_2\alpha_3\alpha_4\alpha_5 = +1$,

(b) $\gamma_1\gamma_2\gamma_3\gamma_4\gamma_5 = +1$.

4.5.20 If $\mathsf{M} = \tfrac{1}{2}(1 + \gamma_5)$, show that

$$\mathsf{M}^2 = \mathsf{M}.$$

Note that $\gamma_5$ may be replaced by any other Dirac matrix (any $\mathsf{E}_{ij}$ of Table 4.1).
If M is Hermitian, then this result, $\mathsf{M}^2 = \mathsf{M}$, is the defining equation for a
quantum mechanical projection operator.

4.5.21 Show that

$$\boldsymbol{\sigma} \times \boldsymbol{\sigma} = 2i\boldsymbol{\sigma},$$

where $\boldsymbol{\sigma}$ is a vector whose components are the $\sigma$ matrices,

$$\boldsymbol{\sigma} = (\sigma_1, \sigma_2, \sigma_3).$$

Note that if $\boldsymbol{\alpha}$ is a polar vector (Section 3.4), then $\boldsymbol{\sigma}$ is an axial vector.

4.5.22 Prove that the 16 Dirac matrices form a linearly independent set.

Hint. Assume the contrary. Let $\mathsf{E}_{mn}$ be a linear combination of the other $\mathsf{E}_{ij}$'s.
Multiply by $\mathsf{E}_{mn}$. Take the trace and show that a contradiction results.

4.5.23 (a) If we assume that a given 4 x 4 matrix, A (with constant elements), can
be written as a linear combination of the 16 Dirac matrices

$$\mathsf{A} = \sum_{i,j=0}^{3} c_{ij}\mathsf{E}_{ij},$$

show that

$$c_{mn} = \tfrac{1}{4}\,\text{trace}(\mathsf{A}\mathsf{E}_{mn}).$$

(b) If A has one and only one nonvanishing element, show that there will be
exactly four nonvanishing coefficients in its expansion.

(c) Expand

$$\mathsf{A} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

in terms of the $\mathsf{E}_{ij}$.

ANS. $\mathsf{A} = \tfrac{1}{4}(\mathsf{E}_{00} + \mathsf{E}_{03} + \mathsf{E}_{30} + \mathsf{E}_{33}) = \tfrac{1}{4}(1 + \sigma_3 + \rho_3 + \delta_3)$.

4.5.24 If A is any one of the Dirac matrices (excluding the unit matrix), it will commute
with eight of the Dirac matrices and anticommute with the other eight. List
the eight matrices that anticommute with $\gamma_1$.
ANS. $\sigma_2, \sigma_3, \rho_1, \alpha_1, \gamma_2, \gamma_3, \rho_3, \delta_1$.


4.5.25 For investigating questions of covariance under Lorentz transformations, one
usually expresses the Dirac electron theory in terms of $\gamma_\mu$, $\mu = 1, 2, 3, 4$. Show
that these four matrices together with their products

(a) $\gamma_\mu\gamma_\nu$, $\mu \ne \nu$

(b) $\gamma_\mu\gamma_\nu\gamma_\lambda$, indices all different

(c) $\gamma_1\gamma_2\gamma_3\gamma_4$

and the unit matrix 1 reproduce all 16 Dirac matrices (apart from constant
factors).

Note. In beta decay theory 1 is used to describe a scalar interaction, the four
$\gamma_i$'s a vector interaction, the six double products ($\gamma_i\gamma_j$) a tensor interaction, the
four triple products ($\gamma_i\gamma_j\gamma_k$) an axial vector interaction, and the product
$\gamma_5 = \gamma_1\gamma_2\gamma_3\gamma_4$ a pseudoscalar interaction. Experiment shows the actual interaction
is a linear combination of vector and axial vector, not conserving parity.

4.5.26 (a) Given $\mathbf{r}' = \mathsf{U}\mathbf{r}$, with U a unitary matrix and r a (column) vector with complex
elements, show that the norm (magnitude) of r is invariant under this
operation.

(b) The matrix U transforms any column vector r with complex elements into
r' leaving the magnitude invariant: $\mathbf{r}^\dagger\mathbf{r} = \mathbf{r}'^\dagger\mathbf{r}'$. Show that U is unitary.

4.5.27 Write a subroutine that will test whether a complex N x N matrix is self-adjoint.
In demanding equality of matrix elements $a_{ij} = a_{ji}^*$, allow some small tolerance
$\varepsilon$ to compensate for truncation error, and so on in the machine.

4.5.28 Write a subroutine that will form the adjoint of a complex M x N matrix.

4.5.29 (a) Write a subroutine that will take a complex M x N matrix A and will yield
the product $\mathsf{A}^\dagger\mathsf{A}$.

Hint. This subroutine can call the subroutines of Exercise 4.2.41 and 4.5.28.

(b) Test your subroutine by taking A to be one or more of the Dirac matrices,
Table 4.1.


4.6 DIAGONALIZATION OF MATRICES 

Moment of Inertia Matrix 

In many physical problems involving matrices it is desirable to carry out a 
(real) orthogonal similarity transformation or a unitary transformation to 
reduce the matrix to a diagonal form, nondiagonal elements all equal to zero. 
One particularly direct example of this is the moment of inertia matrix I of a 
rigid body. From the definition of angular momentum L we have 

$$\mathbf{L} = \mathsf{I}\boldsymbol{\omega},$$ (4.138)

$\boldsymbol{\omega}$ being the angular velocity.¹ The inertia matrix $\mathsf{I}$ is found to have diagonal
components

$$I_{xx} = \sum_i m_i\,(r_i^2 - x_i^2), \quad \text{and so on},$$ (4.139)

the subscript $i$ referring to mass $m_i$ located at $\mathbf{r}_i = (x_i, y_i, z_i)$. For the nondiagonal
components we have the products of inertia,

$$I_{xy} = -\sum_i m_i\,x_i\,y_i = I_{yx}.$$ (4.140)

1 The moment of inertia matrix may also be developed from the kinetic
energy of a rotating body, $T = \tfrac{1}{2}\langle\omega|\mathsf{I}|\omega\rangle$.

FIG. 4.6 Moment of inertia ellipsoid

i 

By inspection, matrix I is symmetric. Also, since I appears in a physical equation 
of the form (4.138), which holds for all orientations of the coordinate system, 
it may be considered to be a tensor (quotient rule, Section 3.3). 

The problem now is to orient the coordinate axes in space so that the I xy and 
the other nondiagonal elements will vanish. As a consequence of this orientation 
and an indication of it, if the angular velocity is along one such realigned axis, 
the angular velocity and the angular momentum will be parallel. 

Geometrical Picture — Ellipsoid 

It is perhaps instructive to consider a geometrical picture of this problem. 
If the inertia matrix $\mathsf{I}$ is multiplied from each side by a unit vector of variable
direction, $\mathbf{n} = (\alpha, \beta, \gamma)$,

$$\langle n|\mathsf{I}|n\rangle = I,$$ (4.141)

where $I$ is a number (scalar) whose magnitude depends on the choice of direction
of n. Carrying out the multiplication, we obtain

$$I = I_{xx}\alpha^2 + I_{yy}\beta^2 + I_{zz}\gamma^2 + 2I_{xy}\alpha\beta + 2I_{xz}\alpha\gamma + 2I_{yz}\beta\gamma.$$ (4.142)

To throw this into one of the standard forms for an ellipsoid, we introduce

$$\boldsymbol{\rho} = \frac{\mathbf{n}}{\sqrt{I}},$$ (4.143)

in which $\boldsymbol{\rho}$ is variable in direction and magnitude. Equation 4.142 becomes

$$1 = I_{xx}\rho_1^2 + I_{yy}\rho_2^2 + I_{zz}\rho_3^2 + 2I_{xy}\rho_1\rho_2 + 2I_{xz}\rho_1\rho_3 + 2I_{yz}\rho_2\rho_3.$$ (4.144)

This is the general form of an ellipsoid relative to the coordinates $\rho_1$, $\rho_2$, $\rho_3$.
However, from analytic geometry it is known that the coordinate axes can
always be rotated to coincide with the axes of our ellipsoid. Then

$$1 = I_1\rho_1'^2 + I_2\rho_2'^2 + I_3\rho_3'^2,$$ (4.145)

in which $\rho_1'$, $\rho_2'$, $\rho_3'$ is the new set of coordinates.

Principal Axes 

In many elementary cases, especially when symmetry is present, these new 
axes, called the principal axes , can be found by inspection. We now proceed to 
develop a general method of finding the diagonal elements and the principal 
axes. 
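A case in which the principal axes can be found by inspection may help fix ideas before the general method. The sketch below is an assumption-laden illustration: the three point masses, their positions, and the helper names are invented for the example. The inertia matrix is built from Eqs. 4.139 and 4.140, and a 45° rotation of the axes about z, suggested by the symmetry of the mass distribution, removes the product of inertia $I_{xy}$.

```python
import math

def matmul(X, Y):
    """Matrix product of X and Y, each given as a list of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matvec(X, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in X]

def inertia_matrix(masses, positions):
    """I_ab = sum_i m_i (r_i^2 delta_ab - x_a x_b), Eqs. 4.139 and 4.140."""
    inertia = [[0.0] * 3 for _ in range(3)]
    for m, r in zip(masses, positions):
        r_sq = sum(x * x for x in r)
        for a in range(3):
            for b in range(3):
                delta = 1.0 if a == b else 0.0
                inertia[a][b] += m * (r_sq * delta - r[a] * r[b])
    return inertia

# Hypothetical rigid body: three point masses.
masses = [1.0, 1.0, 2.0]
positions = [(1.0, 1.0, 0.0), (-1.0, -1.0, 0.0), (0.0, 0.0, 1.0)]
inertia = inertia_matrix(masses, positions)

# Rotate the coordinate axes by 45 degrees about z (orthogonal matrix B).
c = s = math.cos(math.pi / 4)
B = [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]
inertia_rotated = matmul(matmul(B, inertia), transpose(B))

# With the angular velocity along a principal axis, L is parallel to omega.
omega = [1.0, 1.0, 0.0]
L = matvec(inertia, omega)

print(inertia)           # has a nonzero I_xy
print(inertia_rotated)   # diagonal to rounding error
print(L)                 # a multiple of omega
```

When no symmetry is available to suggest the rotation, the axes have to be found from the eigenvalue problem developed in the rest of this section.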


Hermitian Matrices 

First, let us examine an important theorem about the diagonal elements and
the principal axes. In the equation

$$\mathsf{A}|r\rangle = \lambda|r\rangle$$ (4.146)

$\lambda$, a number (scalar), is known as the eigenvalue; $|r\rangle$, the corresponding vector,
is the eigenvector.² The terms were introduced from the early German literature
on quantum mechanics. We now show that if A is a Hermitian matrix,³ its
eigenvalues are real and its eigenvectors orthogonal.

Let $\lambda_i$ and $\lambda_j$ be two eigenvalues and $|r_i\rangle$ and $|r_j\rangle$ the corresponding
eigenvectors of A, a Hermitian matrix. Then

$$\mathsf{A}|r_i\rangle = \lambda_i|r_i\rangle$$ (4.147)

$$\mathsf{A}|r_j\rangle = \lambda_j|r_j\rangle.$$ (4.148)

Equation 4.147 is multiplied by $\langle r_j|$:

$$\langle r_j|\mathsf{A}|r_i\rangle = \lambda_i\langle r_j|r_i\rangle.$$ (4.149)

Equation 4.148 is multiplied by $\langle r_i|$ to give

$$\langle r_i|\mathsf{A}|r_j\rangle = \lambda_j\langle r_i|r_j\rangle.$$ (4.150)

Taking the adjoint* of this equation, we have

$$\langle r_j|\mathsf{A}^\dagger|r_i\rangle = \lambda_j^*\langle r_j|r_i\rangle$$ (4.151)

or

$$\langle r_j|\mathsf{A}|r_i\rangle = \lambda_j^*\langle r_j|r_i\rangle$$ (4.152)

since A is Hermitian. Subtracting Eq. 4.152 from Eq. 4.149, we obtain

$$(\lambda_i - \lambda_j^*)\langle r_j|r_i\rangle = 0.$$ (4.153)

This is a general result for all possible combinations of $i$ and $j$. First, let $j = i$.
Then Eq. 4.153 becomes

$$(\lambda_i - \lambda_i^*)\langle r_i|r_i\rangle = 0.$$ (4.154)

Since $\langle r_i|r_i\rangle = 0$ would be a trivial solution of Eq. 4.154, we conclude that

$$\lambda_i = \lambda_i^*,$$ (4.155)

or $\lambda_i$ is real, for all $i$.

Second, for $i \ne j$ and $\lambda_i \ne \lambda_j$,

$$(\lambda_i - \lambda_j)\langle r_j|r_i\rangle = 0$$ (4.156)

or

$$\langle r_j|r_i\rangle = 0,$$ (4.157)

which means that the eigenvectors of distinct eigenvalues are orthogonal, Eq.
4.157 being our generalization of orthogonality in this complex space.⁴

2 Equation 4.138 will take on this form when $\boldsymbol{\omega}$ is along one of the principal
axes. Then $\mathbf{L} = \lambda\boldsymbol{\omega}$ and $\mathsf{I}\boldsymbol{\omega} = \lambda\boldsymbol{\omega}$. In the mathematics literature $\lambda$ is usually
called a characteristic value, $\boldsymbol{\omega}$ a characteristic vector.

3 If A is real, the Hermitian requirement is replaced by a requirement of
symmetry.

* Note $\langle r_j| = |r_j\rangle^\dagger$.
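Both conclusions, real eigenvalues and orthogonal eigenvectors, can be seen in the smallest nontrivial case. The sketch below is illustrative only: the closed-form 2 x 2 solver `eigen_2x2`, the sample Hermitian matrix `H`, and the non-Hermitian comparison matrix `N` are assumptions, and the solver requires a nonzero upper-right element.

```python
import cmath

def eigen_2x2(M):
    """Eigenvalues and unnormalized eigenvectors of a 2 x 2 matrix whose
    upper-right element is nonzero, from the characteristic equation."""
    (a, b), (c, d) = M
    mean = (a + d) / 2
    root = cmath.sqrt(((a - d) / 2) ** 2 + b * c)
    values = [mean + root, mean - root]
    # (b, lambda - a) satisfies both rows of (M - lambda 1) r = 0.
    vectors = [[b, lam - a] for lam in values]
    return values, vectors

def inner(u, v):
    """Scalar product <u|v> = sum of conj(u_k) v_k."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

def matvec(X, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in X]

H = [[2, 1 - 1j], [1 + 1j, 3]]            # Hermitian: H equals its adjoint
values, vectors = eigen_2x2(H)

N = [[0, 1], [-1, 0]]                     # real but not symmetric
values_n, _ = eigen_2x2(N)

print(values)                             # imaginary parts vanish
print(inner(vectors[0], vectors[1]))      # zero: orthogonal eigenvectors
print(values_n)                           # purely imaginary pair
```

The comparison matrix shows that reality of the eigenvalues is a consequence of the Hermitian property and is not shared by matrices in general.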

If $\lambda_i = \lambda_j$ (degenerate case), $|r_i\rangle$ is not automatically orthogonal to $|r_j\rangle$, but
it may be made orthogonal.⁵ Consider the physical problem of the moment of
inertia matrix again. If $x_1$ is an axis of rotational symmetry, then we will find
that $\lambda_2 = \lambda_3$. Eigenvectors $|r_2\rangle$ and $|r_3\rangle$ are each perpendicular to the symmetry
axis, $|r_1\rangle$, but they lie anywhere in the plane perpendicular to $|r_1\rangle$; that is, any
linear combination of $|r_2\rangle$ and $|r_3\rangle$ is also an eigenvector. Consider
$(a_2|r_2\rangle + a_3|r_3\rangle)$ with $a_2$ and $a_3$ constants. Then

$A(a_2|r_2\rangle + a_3|r_3\rangle) = a_2\lambda_2|r_2\rangle + a_3\lambda_3|r_3\rangle = \lambda_2(a_2|r_2\rangle + a_3|r_3\rangle),$ (4.158)

as is to be expected, for $x_1$ is an axis of rotational symmetry. Therefore, if
$|r_1\rangle$ and $|r_2\rangle$ are fixed, $|r_3\rangle$ may simply be chosen to lie in the plane perpendicular
to $|r_1\rangle$ and also perpendicular to $|r_2\rangle$. A general method of orthogonalizing
solutions, the Gram-Schmidt process, is applied to functions in Section 9.3.
The set of n orthogonal eigenvectors of our n x n Hermitian matrix forms a 


4 The corresponding theory for differential operators (Sturm-Liouville 
theory) appears in Section 9.2. The integral equation analog (Hilbert- 
Schmidt theory) is given in Section 16.4. 

5 We are assuming here that the eigenvectors of the n-fold degenerate $\lambda_i$ span
the corresponding n-dimensional space. This may be shown by including a
parameter $\varepsilon$ in the original matrix to remove the degeneracy and then letting
$\varepsilon$ approach zero (compare Exercise 4.6.30). This is analogous to breaking a
degeneracy in atomic spectroscopy by applying an external magnetic field
(Zeeman effect).
complete set, spanning the n-dimensional (complex) space. This fact is useful
in a variational calculation of the eigenvalues, Section 17.8 (Exercise 4.7.19).
Eigenvalues and eigenvectors are not limited to Hermitian matrices. All
matrices have eigenvalues and eigenvectors. For instance, the stochastic
population matrix T satisfies an eigenvalue equation

$T\,P_{\rm equilibrium} = P_{\rm equilibrium},$

with $\lambda = 1$. However, only Hermitian matrices have all eigenvectors orthogonal
and all eigenvalues real.


Antihermitian Matrices 

Occasionally, in quantum theory we encounter antihermitian matrices: 

$A^\dagger = -A.$

Following the analysis of the first portion of this section, we can show that 

a. The eigenvalues are pure imaginary (or zero). 

b. The eigenvectors corresponding to distinct eigen- 
values are orthogonal. 


The matrix R formed from the normalized eigenvectors is unitary. This anti- 
hermitian property is preserved under unitary transformations. 

Secular Equation 

The preceding demonstration of real eigenvalues and orthogonal eigenvectors
is essentially an existence theorem. To actually determine the eigenvalues $\lambda_i$
and the eigenvectors $|r_i\rangle$ we return to Eq. 4.146. Assuming $|r\rangle$ to be
multiplied by the unit matrix, we may rewrite Eq. 4.146 as

$(A - \lambda 1)|r\rangle = 0,$ (4.159)

in which 1 is the unit matrix. This is a set of simultaneous, homogeneous,
linear equations. By Section 4.1 it has nontrivial solutions only if the
determinant of the coefficients vanishes,

$|A - \lambda 1| = 0.$ (4.160)

Let us consider the case in which A is a 3 x 3 Hermitian matrix. Then

$\begin{vmatrix} a_{11} - \lambda & a_{12} & a_{13} \\ a_{21} & a_{22} - \lambda & a_{23} \\ a_{31} & a_{32} & a_{33} - \lambda \end{vmatrix} = 0.$ (4.161)

Because of its applications in astronomical theories Eq. 4.161 is usually called
the secular equation.⁶ Equation 4.161 yields a cubic equation in $\lambda$, which, of
course, has three roots.⁷ By Eq. 4.155 we know that these roots are real.


6 This equation also appears in second-order perturbation theory in quantum 
mechanics. 

7 See Exercise 6.4.9. 
Substituting one root at a time back into Eq. 4.159, we can find the
corresponding eigenvectors.

EXAMPLE 4.6.1 Eigenvalues and Eigenvectors of a Symmetric Matrix 


Let

$A = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$ (4.162)

The secular equation is

$\begin{vmatrix} -\lambda & 1 & 0 \\ 1 & -\lambda & 0 \\ 0 & 0 & -\lambda \end{vmatrix} = 0,$ (4.163)

or

$-\lambda(\lambda^2 - 1) = 0,$ (4.164)

expanding by minors. The roots are $\lambda = -1, 0, 1$. To find the eigenvector
corresponding to $\lambda = -1$, we substitute this value back into the eigenvalue
equation, Eq. 4.159,

$\begin{pmatrix} -\lambda & 1 & 0 \\ 1 & -\lambda & 0 \\ 0 & 0 & -\lambda \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = 0.$ (4.165)

With $\lambda = -1$, this yields

$x + y = 0, \quad z = 0.$ (4.166)

Within an arbitrary scale factor, and an arbitrary sign (or phase factor),
$|r_1\rangle = (1, -1, 0)$. Note carefully that (for real $|r\rangle$ in ordinary space) the
eigenvector singles out a line in space. The positive or negative sense is not
determined. This indeterminacy could be expected if we noted that Eq. 4.159 is
homogeneous in $|r\rangle$. For convenience we will require that the eigenvectors be
normalized to unity, $\langle r_1|r_1\rangle = 1$. With this choice of sign,

$|r_1\rangle$ or $r_1 = \left(\frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}, 0\right)$ (4.167)

is fixed. For $\lambda = 0$, Eq. 4.159 yields

$y = 0, \quad x = 0.$ (4.168)

$|r_2\rangle$ or $r_2 = (0, 0, 1)$ is a suitable eigenvector. Finally, for $\lambda = 1$, we get

$-x + y = 0, \quad z = 0,$ (4.169)

or

$|r_3\rangle$ or $r_3 = \left(\frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}, 0\right).$ (4.170)

The orthogonality of $r_1$, $r_2$, and $r_3$, corresponding to three distinct eigenvalues,
may be easily verified.
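The verification suggested at the end of Example 4.6.1 can be carried out mechanically. The sketch below assumes the matrix read off from the secular determinant of the example, rows (0, 1, 0), (1, 0, 0), (0, 0, 0), together with the three normalized eigenvectors quoted there; the helper names `matvec` and `dot` are illustrative.

```python
import math

# Matrix of Example 4.6.1, as implied by its secular determinant
A = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 0]]

s = 1 / math.sqrt(2)
# (eigenvalue, normalized eigenvector) pairs quoted in the example
pairs = [(-1, (s, -s, 0.0)),
         (0, (0.0, 0.0, 1.0)),
         (1, (s, s, 0.0))]

def matvec(m, v):
    """Matrix times column vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def dot(u, v):
    """Real scalar product."""
    return sum(x * y for x, y in zip(u, v))

# Largest component of A|r> - lambda|r> for each pair; each should vanish
residuals = [max(abs(x - lam * y) for x, y in zip(matvec(A, r), r))
             for lam, r in pairs]

# Gram matrix <r_i|r_j>; should be the unit matrix for an orthonormal set
gram = [[dot(ri, rj) for _, rj in pairs] for _, ri in pairs]

print("residuals:", residuals)
print("Gram matrix:", gram)
```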


EXAMPLE 4.6.2 Degenerate Eigenvalues 


Consider

$A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.$ (4.171)

The secular equation is

$\begin{vmatrix} 1 - \lambda & 0 & 0 \\ 0 & -\lambda & 1 \\ 0 & 1 & -\lambda \end{vmatrix} = 0$ (4.172)

or

$(1 - \lambda)(\lambda^2 - 1) = 0, \quad \lambda = -1, 1, 1,$ (4.173)

a degenerate case. If $\lambda = -1$, the eigenvalue equation (4.159) yields

$2x = 0, \quad y + z = 0.$ (4.174)

A suitable normalized eigenvector is

$|r_1\rangle$ or $r_1 = \left(0, \frac{1}{\sqrt{2}}, -\frac{1}{\sqrt{2}}\right).$ (4.175)

For $\lambda = 1$, we get

$-y + z = 0$ (4.176)

and no further information. We have an infinite number of choices. Suppose,
as one possible choice, $r_2$ is taken as

$|r_2\rangle$ or $r_2 = \left(0, \frac{1}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right),$ (4.177)

which clearly satisfies Eq. 4.176. Then $r_3$ must be perpendicular to $r_1$ and may
be made perpendicular to $r_2$ by⁸

$r_3 = r_1 \times r_2 = (1, 0, 0).$ (4.178)


8 The use of the cross product is limited to three-dimensional space (see 
Section 1.4). 
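The freedom in the degenerate subspace of Example 4.6.2 can be illustrated numerically. The sketch below does two things: it forms the cross product of Eq. 4.178, and it applies one Gram-Schmidt step to two non-orthogonal eigenvectors belonging to the eigenvalue 1. The starting vectors u and v are arbitrary illustrative choices satisfying $-y + z = 0$; they do not appear in the text.

```python
import math

A = [[1, 0, 0],
     [0, 0, 1],
     [0, 1, 0]]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]

def cross(u, v):
    """Ordinary three-dimensional cross product."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

s = 1 / math.sqrt(2)
r1 = [0.0, s, -s]          # eigenvalue -1, Eq. 4.175
r2 = [0.0, s, s]           # eigenvalue +1, the choice of Eq. 4.177
r3 = cross(r1, r2)         # Eq. 4.178

# Gram-Schmidt inside the degenerate (eigenvalue +1) subspace
u = [1.0, 1.0, 1.0]        # illustrative eigenvector, satisfies -y + z = 0
v = [0.0, 1.0, 1.0]        # another one, not orthogonal to u
e1 = normalize(u)
w = [vk - dot(e1, v) * ek for vk, ek in zip(v, e1)]   # remove the e1 component
e2 = normalize(w)

print("r3 =", r3)
print("<e1|e2> =", dot(e1, e2))
print("A e2 - e2 =", [x - y for x, y in zip(matvec(A, e2), e2)])
```

Any such orthonormal pair in the plane $-y + z = 0$ is an equally acceptable choice; the cross product is simply the shortcut available in three dimensions.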
Diagonalization

The equations developed for our existence theorem at the beginning of this
section can be used to form a transformation matrix that will convert the
Hermitian matrix A into diagonal form. Let R be a matrix formed from the
three orthonormal column vectors $|r_1\rangle$, $|r_2\rangle$, and $|r_3\rangle$ in any desired order,

$R = \begin{pmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ z_1 & z_2 & z_3 \end{pmatrix},$ (4.179)

in which each column $\{x_i, y_i, z_i\}$ is an eigenvector $r_i$. Since

$\langle r_i|r_j\rangle = \delta_{ij},$ (4.180)

R is unitary (or simply orthogonal if A, and therefore $r_i$, are real). Then, forming
$R^\dagger A R$, we have

$R^\dagger A R = \begin{pmatrix} \langle r_1| \\ \langle r_2| \\ \langle r_3| \end{pmatrix} A \begin{pmatrix} |r_1\rangle & |r_2\rangle & |r_3\rangle \end{pmatrix} = \begin{pmatrix} \langle r_1| \\ \langle r_2| \\ \langle r_3| \end{pmatrix} \begin{pmatrix} \lambda_1|r_1\rangle & \lambda_2|r_2\rangle & \lambda_3|r_3\rangle \end{pmatrix} = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix}.$ (4.181)

Hence $R^\dagger A R$ is a diagonal matrix with eigenvalues $\lambda_i$, the order of the
eigenvalues corresponding to the order of the column vectors $r_i$ or $|r_i\rangle$ in R. To
develop the geometrical picture, consider A, a real (symmetric) matrix with
real eigenvalues and real eigenvectors. Matrix R corresponds to $B^{-1}$ in Eq.
4.95 or better, the transpose $\tilde{R}$ corresponds to B, $\tilde{R}$ being composed of $\langle r_1|$ and so on, the
eigenvectors $r_i$ written as row vectors,

$\begin{pmatrix} \langle r_1| \\ \langle r_2| \\ \langle r_3| \end{pmatrix} = \begin{pmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \end{pmatrix} = \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{pmatrix}.$ (4.182)

Now the row $(b_{i1}, b_{i2}, b_{i3})$, which defines a unit vector $r_i$ in relation to the
original coordinate system, specifies the three direction cosines of $r_i$ with the
original axes. Remembering that matrix B rotates the coordinate system into
a new system in which (here) A is diagonal, we see that this new system is
specified by the three eigenvectors $r_i = (x_i, y_i, z_i)$. They are the unit vectors
along the principal axes, the axes in relation to which A is diagonal.
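As a concrete instance of Eq. 4.181, the product $\tilde{R}AR$ can be formed for the real symmetric matrix of Example 4.6.1. This is a sketch under the assumption that the columns of R are the three normalized eigenvectors of that example in the order $\lambda = -1, 0, 1$; `matmul` and `transpose` are illustrative helpers.

```python
import math

def matmul(x, y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def transpose(m):
    return [list(row) for row in zip(*m)]

A = [[0, 1, 0],
     [1, 0, 0],
     [0, 0, 0]]

s = 1 / math.sqrt(2)
# Columns are r1 = (s, -s, 0), r2 = (0, 0, 1), r3 = (s, s, 0)
R = [[s, 0.0, s],
     [-s, 0.0, s],
     [0.0, 1.0, 0.0]]

D = matmul(transpose(R), matmul(A, R))    # R-transpose A R
I3 = matmul(transpose(R), R)              # orthogonality check

for row in D:
    print([round(x, 12) for x in row])
```

The diagonal of D reproduces the eigenvalues in the column order chosen for R; permuting the columns of R permutes the diagonal in the same way.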

The preceding analysis has the advantage of exhibiting and clarifying 
conceptual relationships in the diagonalization of matrices. However, for 
matrices larger than 3 x 3, or perhaps 4x4, the process rapidly becomes so 
cumbersome that we turn gratefully to high-speed computers and iterative
techniques.⁹ One such technique is the Jacobi method for determining
eigenvalues and eigenvectors of real symmetric matrices. This Jacobi technique for
determining eigenvalues and eigenvectors and the Gauss-Seidel method of
solving systems of simultaneous linear equations are examples of relaxation
methods. They are iterative techniques in which, one hopes, the errors will
decrease or relax as the iterations continue. Relaxation methods are used
extensively for the solution of partial differential equations.
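The text does not spell out the Jacobi method, so the following is only a sketch of one common form of it, the cyclic Jacobi rotation scheme: each step applies a plane rotation chosen to zero one off-diagonal element, and sweeps are repeated until the off-diagonal elements have relaxed to a tolerance. The function name, the tolerance, and the test matrix are illustrative assumptions.

```python
import math

def jacobi_eigen(a, tol=1e-12, max_sweeps=50):
    """Cyclic Jacobi iteration for a real symmetric matrix (list of rows).
    Returns (eigenvalues, R) with the eigenvectors as the columns of R."""
    n = len(a)
    a = [[float(x) for x in row] for row in a]       # work on a copy
    r = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = math.sqrt(sum(a[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:                                # off-diagonal part relaxed
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Angle with |theta| <= pi/4 that makes the new a[p][q] vanish
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):                   # A <- A J  (columns p, q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):                   # A <- J^T A  (rows p, q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
                for k in range(n):                   # R <- R J
                    rkp, rkq = r[k][p], r[k][q]
                    r[k][p] = c * rkp - s * rkq
                    r[k][q] = s * rkp + c * rkq
    return [a[i][i] for i in range(n)], r

w = math.sqrt(8.0)
A = [[1.0, w, 0.0],
     [w, 1.0, w],
     [0.0, w, 1.0]]          # illustrative symmetric matrix
vals, R = jacobi_eigen(A)
print(sorted(vals))
```

For this test matrix the secular equation gives $\lambda = -3, 1, 5$, which the iteration should reproduce to rounding error. Production work would normally rely on a library routine rather than a hand-written loop.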


EXERCISES 

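4.6.1 (a) Starting with the angular momentum of the ith element of mass,

$\mathbf{L}_i = \mathbf{r}_i \times \mathbf{p}_i = m_i\,\mathbf{r}_i \times (\boldsymbol{\omega} \times \mathbf{r}_i),$

derive the inertia matrix such that $\mathbf{L} = I\boldsymbol{\omega}$, $|L\rangle = I|\omega\rangle$.

(b) Repeat the derivation starting with kinetic energy

$T_i = \tfrac{1}{2} m_i (\boldsymbol{\omega} \times \mathbf{r}_i)^2.$ $\left(T = \tfrac{1}{2}\langle\omega|I|\omega\rangle\right)$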

4.6.2 Show that the eigenvalues of a matrix are unaltered if the matrix is transformed 
by a similarity transformation. This property is not limited to symmetric or 
Hermitian matrices. It holds for any matrix satisfying the eigenvalue equation, 
Eq. 4.159. If our matrix can be brought into diagonal form by a similarity 
transformation, then two immediate consequences are 

1 . The trace (sum of eigenvalues) is invariant under a similarity 
transformation. 

2. The determinant (product of eigenvalues) is invariant under a 
similarity transformation. 

Note . Prove this separately (for matrices that cannot be diagonalized). The 
invariance of the trace and determinant are often demonstrated by using the 
Cayley-Hamilton theorem: A matrix satisfies its own characteristic (secular) 
equation. 

4.6.3 As a converse of the theorem that Hermitian matrices have real eigenvalues and 
that eigenvectors corresponding to distinct eigenvalues are orthogonal, show 
that if 

(a) the eigenvalues of a matrix are real and 

(b) the eigenvectors satisfy Eq. 4.180, $r_i^\dagger r_j = \delta_{ij}$ (or $\langle r_i|r_j\rangle = \delta_{ij}$), then the matrix
is Hermitian.

4.6.4 Show that a real matrix that is not symmetric cannot be diagonalized by an 
orthogonal similarity transformation. 

Hint . Assume that the nonsymmetric real matrix can be diagonalized and develop 
a contradiction. 

4.6.5 The matrices representing the angular momentum components $J_x$, $J_y$, and $J_z$
are all Hermitian. Show that the eigenvalues of $J^2$, where $J^2 = J_x^2 + J_y^2 + J_z^2$,
are real and nonnegative.


9 In higher-dimensional systems the secular equation may be strongly ill- 
conditioned with respect to the determination of its roots (the eigenvalues). 
Direct solution by machine may be very inaccurate. Iterative techniques for 
diagonalizing the original matrix are usually preferred. 
4.6.6 A has eigenvalues $\lambda_i$ and corresponding eigenvectors $|x_i\rangle$. Show that $A^{-1}$ has the
same eigenvectors but with eigenvalues $\lambda_i^{-1}$.

4.6.7 A square matrix with zero determinant is labeled singular. 

(a) If A is singular, show that there is at least one nonzero column vector v 
such that 

A|v> = 0. 

(b) If there is a nonzero vector |v> such that 

A|v> = 0, 

show that A is a singular matrix. This means that if a matrix (or operator) 
has zero as an eigenvalue, the matrix (or operator) has no inverse. 

4.6.8 The same similarity transformation diagonalizes each of two matrices. Show 
that the original matrices must commute. (This is particularly important in the 
matrix (Heisenberg) formulation of quantum mechanics.) 

4.6.9 Two Hermitian matrices A and B have the same eigenvalues. Show that A and 
B are related by a unitary similarity transformation. 

4.6.10 Find the eigenvalues and an orthonormal (orthogonal and normalized) set of 
eigenvectors for the matrices of Exercise 4.2.15. 

4.6.11 Show that the inertia matrix for a single particle of mass m at (x, y, z) has a zero
determinant. Explain this result in terms of the invariance of the determinant 
of a matrix under similarity transformations (Exercise 4.3.10) and a possible 
rotation of the coordinate system. 

4.6.12 A certain rigid body may be represented by three point masses : 


    $m_1 = 1$ at (1, 1, -2)
    $m_2 = 2$ at (-1, -1, 0)
    $m_3 = 1$ at (1, 1, 2).


(a) Find the inertia matrix. 

(b) Diagonalize the inertia matrix obtaining the eigenvalues and the principal 
axes (as orthonormal eigenvectors). 

4.6.13 Unit masses are placed as shown in the figure.

[Figure not reproduced; only the z-axis label survives.]

(a) Find the moment of inertia matrix.

(b) Find the eigenvalues and a set of orthonormal eigenvectors.

(c) Explain the degeneracy in terms of the symmetry of the system.

ANS. $I = \begin{pmatrix} 4 & -1 & -1 \\ -1 & 4 & -1 \\ -1 & -1 & 4 \end{pmatrix}$, $\lambda_1 = 2$, $r_1 = (1/\sqrt{3}, 1/\sqrt{3}, 1/\sqrt{3})$, $\lambda_2 = \lambda_3 = 5$.

4.6.14 A mass $m_1 = \tfrac{1}{2}$ kg is located at (1, 1, 1) (meters), a mass $m_2 = \tfrac{1}{2}$ kg is at (-1, -1,
-1). The two masses are held together by an ideal (weightless, rigid) rod.

(a) Find the moment of inertia tensor of this pair of masses.

(b) Find the eigenvalues and eigenvectors of this inertia matrix.

(c) Explain the meaning, the physical significance of the $\lambda = 0$ eigenvalue.
What is the significance of the corresponding eigenvector?

(d) Now that you have solved this problem by rather sophisticated matrix-tensor
techniques, explain how you could obtain

(1) $\lambda = 0$ and $\lambda = ?$ by inspection.

(2) $r_{\lambda = 0} = ?$ by inspection.

(By inspection means using freshman physics.) 

4.6.15 Unit masses are at the eight corners of a cube (±1, ±1, ±1). Find the moment
of inertia matrix and show that there is a triple degeneracy. This means that so 
far as moments of inertia are concerned, the cubic structure exhibits spherical 
symmetry. 


4.6.16 Find the eigenvalues and corresponding orthonormal eigenvectors of the
following matrices (as a numerical check, note that the sum of the eigenvalues equals
the sum of the diagonal elements of the original matrix, Exercise 4.3.9). Note
also the correspondence between det A = 0 and the existence of $\lambda = 0$, as
required by Exercises 4.6.2 and 4.6.7.

[matrix missing] ANS. $\lambda = 0, 1, 2$.

4.6.17 $A = \begin{pmatrix} 1 & \sqrt{2} & 0 \\ \sqrt{2} & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$ ANS. $\lambda = -1, 0, 2$.

4.6.18 [matrix missing] ANS. $\lambda = -1, 1, 2$.

4.6.19 $A = \begin{pmatrix} 1 & \sqrt{8} & 0 \\ \sqrt{8} & 1 & \sqrt{8} \\ 0 & \sqrt{8} & 1 \end{pmatrix}.$ ANS. $\lambda = -3, 1, 5$.

4.6.20 $A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 1 & 1 \end{pmatrix}.$ ANS. $\lambda = 0, 1, 2$.

4.6.21 [matrix missing] ANS. $\lambda = -1, 1, 2$.
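4.6.22 [matrix missing] ANS. $\lambda = -\sqrt{2}, 0, \sqrt{2}$.

4.6.23 [matrix missing] ANS. $\lambda = 0, 2, 2$.

4.6.24 [matrix missing] ANS. $\lambda = -1, -1, 2$.

4.6.25 [matrix missing] ANS. $\lambda = -1, 2, 2$.

4.6.26 [matrix missing] ANS. $\lambda = 0, 0, 3$.

4.6.27 [matrix missing] ANS. $\lambda = 1, 1, 6$.

4.6.28 [matrix missing] ANS. $\lambda = 0, 0, 2$.

4.6.29 [matrix missing] ANS. $\lambda = 2, 3, 6$.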


4.6.30 (a) Determine the eigenvalues and eigenvectors of

[matrix missing]

Note that the eigenvalues are degenerate for $\varepsilon = 0$ but the eigenvectors are
orthogonal for all $\varepsilon \neq 0$ and $\varepsilon \to 0$.

(b) Determine the eigenvalues and eigenvectors of

[matrix missing]

Note that the eigenvalues are degenerate for $\varepsilon = 0$ and for this (nonsymmetric)
matrix the eigenvectors ($\varepsilon = 0$) do not span the space.

(c) Find the cosine of the angle between the two eigenvectors as a function of
$\varepsilon$ for $0 \le \varepsilon \le 1$.

4.6.31 (a) Take the coefficients of the simultaneous linear equations of Exercise 4.1.7
to be the matrix elements $a_{ij}$ of matrix A (symmetric). Calculate the
eigenvalues and eigenvectors.

(b) Form a matrix R whose columns are the eigenvectors of A and calculate
the triple matrix product $\tilde{R}AR$.

ANS. $\lambda = 3.33163$

4.6.32 Repeat Exercise 4.6.31 by using the matrix of Exercise 4.2.39.


4.7 EIGENVECTORS, EIGENVALUES 

In Section 4.6 we concentrate primarily on Hermitian or real symmetric 
matrices and on the actual process of finding the eigenvalues and eigenvectors. 
In this section we generalize to normal matrices with Hermitian and unitary 
matrices as special cases. The physically important problem of normal modes 
of vibration and the numerically important problem of ill-conditioned matrices 
are also considered. 


Normal Matrices 1 

A normal matrix is a matrix that commutes with its adjoint, 

$[A, A^\dagger] = 0.$

Obvious and important examples are Hermitian and unitary matrices. We will
show that normal matrices have orthogonal eigenvectors (see Table 4.2). We
proceed in two steps.

I. Let A have an eigenvector $|x\rangle$ and corresponding eigenvalue $\lambda$. Then

$A|x\rangle = \lambda|x\rangle$ (4.183)

or

$(A - \lambda 1)|x\rangle = 0.$ (4.184)

For convenience the combination $A - \lambda 1$ will be labeled B. Taking the adjoint
of Eq. 4.184, we obtain

$\langle x|(A - \lambda 1)^\dagger = 0 = \langle x|B^\dagger.$ (4.185)

Because

$[(A - \lambda 1), (A - \lambda 1)^\dagger] = [A, A^\dagger] = 0,$

we have

$[B, B^\dagger] = 0.$ (4.186)

The matrix B is also normal.


1 Normal matrices are the largest class of matrices that can be diagonalized 
by unitary transformations. For an extensive discussion of normal matrices, 
see “Normal matrices for physicists,'' P. A. Macklin, Am. J. Phys. 52: 513 
(1984). 
From Eqs. 4.184 and 4.185 we form

$\langle x|B^\dagger B|x\rangle = 0.$ (4.187)

This equals

$\langle x|B B^\dagger|x\rangle = 0$ (4.188)

by Eq. 4.186. Now Eq. 4.188 may be rewritten as

$(B^\dagger|x\rangle)^\dagger (B^\dagger|x\rangle) = 0.$ (4.189)

Thus

$B^\dagger|x\rangle = (A^\dagger - \lambda^* 1)|x\rangle = 0.$ (4.190)

We see that for normal matrices, $A^\dagger$ has the same eigenvectors as A but the
complex conjugate eigenvalues.

II. Now, considering more than one eigenvector-eigenvalue, we have

$A|x_i\rangle = \lambda_i|x_i\rangle$ (4.191)

$A|x_j\rangle = \lambda_j|x_j\rangle.$ (4.192)

Multiplying Eq. 4.192 from the left by $\langle x_i|$ yields

$\langle x_i|A|x_j\rangle = \lambda_j\langle x_i|x_j\rangle.$ (4.193)

Operating on the left side of Eq. 4.193, we obtain

$\langle x_i|A = (A^\dagger|x_i\rangle)^\dagger.$ (4.194)

From Eq. 4.190 with $A^\dagger$ having the same eigenvectors as A but the complex
conjugate eigenvalues

$(A^\dagger|x_i\rangle)^\dagger = (\lambda_i^*|x_i\rangle)^\dagger = \lambda_i\langle x_i|.$

Substituting into Eq. 4.193, we have

$\lambda_i\langle x_i|x_j\rangle = \lambda_j\langle x_i|x_j\rangle$ (4.195)

or

$(\lambda_i - \lambda_j)\langle x_i|x_j\rangle = 0.$ (4.196)

This is the same as Eq. 4.156.

For $\lambda_i \neq \lambda_j$

$\langle x_i|x_j\rangle = 0.$

The eigenvectors corresponding to different eigenvalues of a normal matrix
are orthogonal. This means that a normal matrix may be diagonalized by a
unitary transformation. The required unitary matrix may be constructed from
the orthonormal eigenvectors as shown earlier in Section 4.6.

The converse of this result is also valid. If A can be diagonalized by a unitary
transformation, then A is normal.
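Both steps of this argument can be seen at work on a matrix that is normal but neither Hermitian nor unitary. The sketch below uses an illustrative 2 x 2 real matrix N, not taken from the text, whose eigenvalues $1 \pm i$ and eigenvectors $(1, \pm i)$ are easily found by hand; the helper names are likewise assumptions.

```python
def dagger(m):
    """Adjoint (conjugate transpose) of a square matrix."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(m, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]

def inner(u, v):
    """Hermitian scalar product <u|v>."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

# Illustrative normal matrix: real, not symmetric, not unitary
N = [[1 + 0j, 1 + 0j],
     [-1 + 0j, 1 + 0j]]
Nd = dagger(N)

# [N, N-dagger]; every element should vanish for a normal matrix
commutator = [[p - q for p, q in zip(row1, row2)]
              for row1, row2 in zip(matmul(N, Nd), matmul(Nd, N))]

# Hand-computed (eigenvalue, eigenvector) pairs of N
pairs = [(1 + 1j, [1 + 0j, 1j]),
         (1 - 1j, [1 + 0j, -1j])]

# Step I: N-dagger has the same eigenvectors with conjugate eigenvalues
step_one = [max(abs(y - lam.conjugate() * xk) for y, xk in zip(matvec(Nd, x), x))
            for lam, x in pairs]

# Step II: eigenvectors of distinct eigenvalues are orthogonal
overlap = inner(pairs[0][1], pairs[1][1])

print(commutator, step_one, overlap)
```

Note that the orthogonality holds only with the Hermitian scalar product; the ordinary real dot product of $(1, i)$ and $(1, -i)$ is 2, not 0.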
TABLE 4.2

Matrix          Eigenvalues                      Eigenvectors (for different eigenvalues)

Hermitian       Real                             Orthogonal
Antihermitian   Pure imaginary (or zero)         Orthogonal
Unitary         Unit magnitude                   Orthogonal
Normal          If A has eigenvalue $\lambda$,          Orthogonal;
                $A^\dagger$ has eigenvalue $\lambda^*$.          A and $A^\dagger$ have the same eigenvectors.


Normal Modes of Vibration 

We consider the vibrations of a classical model of the CO₂ molecule. It is
an illustration of the application of matrix techniques to a problem that does 
not start as a matrix problem. It also provides an example of the eigenvalues 
and eigenvectors of an asymmetric real matrix. 

EXAMPLE 4.7. 1 Normal Modes 

Consider three masses on the x-axis joined by springs as shown in Fig. 4.7.
The spring forces are assumed to be linear (small displacements, Hooke's law)
and the masses are constrained to stay on the x-axis.

[FIG. 4.7: masses M, m, M in a line, joined by two springs of constant k]

Using a different coordinate for each mass, Newton's second law yields the
set of equations

$\ddot{x}_1 = -\frac{k}{M}(x_1 - x_2)$

$\ddot{x}_2 = -\frac{k}{m}(x_2 - x_1) - \frac{k}{m}(x_2 - x_3)$ (4.197)

$\ddot{x}_3 = -\frac{k}{M}(x_3 - x_2).$

The system of masses is vibrating. We seek the common frequencies, $\omega$, such
that all masses vibrate at this same frequency. These are the normal modes. Let

$x_i = x_{i0}e^{i\omega t}, \quad i = 1, 2, 3.$

Substituting into Eq. 4.197, we may rewrite this set as 
$\begin{pmatrix} \frac{k}{M} & -\frac{k}{M} & 0 \\ -\frac{k}{m} & \frac{2k}{m} & -\frac{k}{m} \\ 0 & -\frac{k}{M} & \frac{k}{M} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \omega^2 \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix},$ (4.198)

with the common factor $e^{i\omega t}$ divided out. We have a matrix-eigenvalue equation
with the matrix asymmetric. The secular equation is

$\begin{vmatrix} \frac{k}{M} - \omega^2 & -\frac{k}{M} & 0 \\ -\frac{k}{m} & \frac{2k}{m} - \omega^2 & -\frac{k}{m} \\ 0 & -\frac{k}{M} & \frac{k}{M} - \omega^2 \end{vmatrix} = 0.$ (4.199)

This leads to

$\omega^2\left(\frac{k}{M} - \omega^2\right)\left(\omega^2 - \frac{2k}{m} - \frac{k}{M}\right) = 0.$

The eigenvalues are

$\omega^2 = 0, \quad \frac{k}{M}, \quad \frac{k}{M} + \frac{2k}{m},$

all real.

The corresponding eigenvectors are determined by substituting the
eigenvalues back into Eq. 4.198 one eigenvalue at a time. For $\omega^2 = 0$ Eq. 4.198
yields

$x_1 - x_2 = 0$

$-x_1 + 2x_2 - x_3 = 0$

$-x_2 + x_3 = 0.$

Then, we get

$x_1 = x_2 = x_3.$

This describes pure translation, no relative motion of the masses, no vibration.
For $\omega^2 = k/M$ Eq. 4.198 yields

$x_1 = -x_3, \quad x_2 = 0.$

The two outer masses are moving in opposite directions. The center mass is
stationary.

For $\omega^2 = k/M + 2k/m$ the eigenvector components are

$x_1 = x_3, \quad x_2 = -\frac{2M}{m}x_1.$

The two outer masses are moving together. The center mass is moving opposite
to the two outer ones. The net momentum is zero.

Any displacement of the three masses along the x-axis can be described as 
a linear combination of these three types of motion : translation plus two forms 
of vibration. 
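The three modes can be confirmed by direct substitution into the matrix equation of the example. The sketch below uses exact rational arithmetic; the numerical values of k, M, and m are purely illustrative (the 16 and 12 merely echo oxygen and carbon mass numbers), and `matvec` is an assumed helper name.

```python
from fractions import Fraction

# Illustrative values only; any positive k, M, m would do
k, M, m = Fraction(1), Fraction(16), Fraction(12)

# Asymmetric matrix of the normal-mode eigenvalue equation (Eq. 4.198)
A = [[k / M, -k / M, 0],
     [-k / m, 2 * k / m, -k / m],
     [0, -k / M, k / M]]

# (omega^2, eigenvector) for translation and the two vibrations
modes = [(Fraction(0), [1, 1, 1]),
         (k / M, [1, 0, -1]),
         (k / M + 2 * k / m, [1, -2 * M / m, 1])]

def matvec(a, x):
    return [sum(a[i][j] * x[j] for j in range(3)) for i in range(3)]

# True where A x equals omega^2 x exactly
checks = [matvec(A, x) == [w2 * xi for xi in x] for w2, x in modes]

# Net momentum of the third mode is proportional to M x1 + m x2 + M x3
x = modes[2][1]
net_momentum = M * x[0] + m * x[1] + M * x[2]

print(checks, net_momentum)
```

Because the matrix is asymmetric, these eigenvectors are not mutually orthogonal in the ordinary sense, even though all three eigenvalues are real.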


Ill-conditioned Systems 

A system of simultaneous linear equations may be written as 

$A|x\rangle = |y\rangle$ or $A^{-1}|y\rangle = |x\rangle,$ (4.200)

with A and $|y\rangle$ known and $|x\rangle$ unknown. The reader may encounter examples
in which a small error in $|y\rangle$ results in a larger error in $|x\rangle$. In this case the
matrix A is called "ill-conditioned." With $|\delta x\rangle$ an error in $|x\rangle$ and $|\delta y\rangle$ an
error in $|y\rangle$, the relative errors may be written as

$\left[\frac{\langle\delta x|\delta x\rangle}{\langle x|x\rangle}\right]^{1/2} \le K(A)\left[\frac{\langle\delta y|\delta y\rangle}{\langle y|y\rangle}\right]^{1/2}.$ (4.201)

Here K(A), a property of matrix A, is labeled the condition number. For A
Hermitian one form of the condition number is given by¹

$K(A) = \frac{|\lambda|_{\max}}{|\lambda|_{\min}}.$ (4.202)

An approximate form due to Turing² is

$K(A) = n\,[a_{ij}]_{\max}\,[a^{-1}_{ij}]_{\max},$ (4.203)

in which n is the order of the matrix, $[a_{ij}]_{\max}$ is the maximum element in A,
and $[a^{-1}_{ij}]_{\max}$ is the maximum element in $A^{-1}$.

EXAMPLE 4.7.2 An Ill-conditioned Matrix

A common example of an ill-conditioned matrix is the Hilbert matrix,
$H_{ij} = (i + j - 1)^{-1}$. The Hilbert matrix of order 4, $H_4$, is encountered in a least
squares fit of data to a third-degree polynomial. We have

$H_4 = \begin{pmatrix} 1 & \frac{1}{2} & \frac{1}{3} & \frac{1}{4} \\ \frac{1}{2} & \frac{1}{3} & \frac{1}{4} & \frac{1}{5} \\ \frac{1}{3} & \frac{1}{4} & \frac{1}{5} & \frac{1}{6} \\ \frac{1}{4} & \frac{1}{5} & \frac{1}{6} & \frac{1}{7} \end{pmatrix}.$ (4.204)

The elements of the inverse matrix (order n) are given by

$(H_n^{-1})_{ij} = \frac{(-1)^{i+j}}{i + j - 1} \cdot \frac{(n + i - 1)!\,(n + j - 1)!}{[(i - 1)!\,(j - 1)!]^2\,(n - i)!\,(n - j)!}.$ (4.205)


1 Forsythe, George E., and Cleve B. Moler, Computer Solution of Linear
Algebraic Equations.

2 Compare Todd, John, The Condition of the Finite Segments of the Hilbert
Matrix, in the National Bureau of Standards' Applied Mathematics Series
#313.
For n = 4

$H_4^{-1} = \begin{pmatrix} 16 & -120 & 240 & -140 \\ -120 & 1200 & -2700 & 1680 \\ 240 & -2700 & 6480 & -4200 \\ -140 & 1680 & -4200 & 2800 \end{pmatrix}.$ (4.206)

From Eq. 4.203 the Turing estimate of the condition number for $H_4$ becomes

$K_{\rm Turing} = 4 \times 1 \times 6480 = 2.59 \times 10^4.$

This is a warning that an input error may be multiplied by 25,000 in the
calculation of the output result. It is a statement that $H_4$ is ill-conditioned.
If you encounter a highly ill-conditioned system you have two alternatives 
(besides abandoning the problem). 

a. Try a different mathematical attack. 

b. Arrange to carry more significant figures and push 
through by brute force. 
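Exact rational arithmetic is the limiting case of alternative (b), and it makes the Hilbert-matrix example easy to check. The sketch below builds $H_n$, evaluates the closed form of Eq. 4.205 for the inverse, confirms that the product is the unit matrix, and forms the Turing estimate of Eq. 4.203. The function names are illustrative, and the use of Python's `fractions` module is an assumption of this sketch, not something the text prescribes.

```python
from fractions import Fraction
from math import factorial

def hilbert(n):
    """Hilbert matrix H_ij = 1/(i + j - 1), indices running from 1 to n."""
    return [[Fraction(1, i + j - 1) for j in range(1, n + 1)]
            for i in range(1, n + 1)]

def hilbert_inverse(n):
    """Closed-form inverse of Eq. 4.205 (indices i, j run from 1 to n)."""
    def elem(i, j):
        num = factorial(n + i - 1) * factorial(n + j - 1)
        den = ((factorial(i - 1) * factorial(j - 1)) ** 2
               * factorial(n - i) * factorial(n - j))
        return Fraction((-1) ** (i + j), i + j - 1) * Fraction(num, den)
    return [[elem(i, j) for j in range(1, n + 1)] for i in range(1, n + 1)]

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

n = 4
H, Hinv = hilbert(n), hilbert_inverse(n)
product = matmul(H, Hinv)                     # exact; should be the unit matrix

# Turing estimate: n * (largest |element| of H) * (largest |element| of H^-1)
turing = (n * max(abs(x) for row in H for x in row)
            * max(abs(x) for row in Hinv for x in row))

print([int(x) for x in Hinv[0]])
print(turing)
```

With fractions there is no rounding error to amplify, at the cost of speed and rapidly growing integers; the same product formed in single-precision floating point would show the loss of accuracy the condition number warns about.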


As previously seen, matrix eigenvector-eigenvalue techniques are not limited
to the solution of strictly matrix problems. A further example of the transfer of
techniques from one area to another is seen in the application of matrix
techniques to the solution of Fredholm eigenvalue integral equations, Section 16.3.
In turn, these matrix techniques are strengthened by a variational calculation
of Section 17.8.


EXERCISES 

4.7.1 Show that every 2x2 matrix has two eigenvectors and corresponding eigen- 
values. The eigenvectors are not necessarily orthogonal. The eigenvalues are not 
necessarily real. 

4.7.2 As an illustration of Exercise 4.7.1, find the eigenvalues and corresponding
eigenvectors for

[matrix missing]

Note that the eigenvectors are not orthogonal.

ANS. $r_1 = (2, -1)$;
$\lambda_2 = 4$, $r_2 = (2, 1)$.

4.7.3 If A is a 2 x 2 matrix show that its eigenvalues $\lambda$ satisfy the equation

$\lambda^2 - \lambda\,{\rm trace}(A) + \det A = 0.$

4.7.4 Assuming a unitary matrix U to satisfy an eigenvalue equation $Ur = \lambda r$, show
that the eigenvalues of the unitary matrix have unit magnitude. This same result 
holds for real orthogonal matrices. 
4.7.5 Since an orthogonal matrix describing a rotation in real three-dimensional
space is a special case of a unitary matrix, such an orthogonal matrix can be
diagonalized by a unitary transformation.

(a) Show that the sum of the three eigenvalues is $1 + 2\cos\varphi$, where $\varphi$ is the
net angle of rotation about a single fixed axis.

(b) Given that one eigenvalue is 1, show that the other two eigenvalues must
be $e^{i\varphi}$ and $e^{-i\varphi}$.

Our orthogonal rotation matrix (real elements) has complex eigenvalues.

4.7.6 A is an nth order Hermitian matrix with orthonormal eigenvectors $|x_i\rangle$ and
real eigenvalues $\lambda_1 \le \lambda_2 \le \lambda_3 \le \cdots \le \lambda_n$. Show that for a unit magnitude vector
$|y\rangle$,

$\lambda_1 \le \langle y|A|y\rangle \le \lambda_n.$

4.7.7 A particular matrix is both Hermitian and unitary. Show that its eigenvalues 
are all ± 1 . 

Note. The Pauli and Dirac matrices are specific examples. 


4.7.8 For his relativistic electron theory Dirac required a set of four anticommuting 
matrices. Assume that these matrices are to be Hermitian and unitary. If these 
are n x n matrices, show that n must be even. With 2x2 matrices inadequate 
(why?), this demonstrates that the smallest possible matrices forming a set of 
four anticommuting, Hermitian, unitary matrices are 4x4. 

4.7.9 A is a normal matrix with eigenvalues $\lambda_n$ and orthonormal eigenvectors $|x_n\rangle$.
Show that A may be written as

$A = \sum_n \lambda_n |x_n\rangle\langle x_n|.$

Hint. Show that both this eigenvector form of A and the original A give the
same result acting on an arbitrary vector $|y\rangle$.

4.7.10 A has eigenvalues 1 and -1 and corresponding eigenvectors $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$.
Construct A.

ANS. $A = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}.$


4.7.11 A non-Hermitian matrix A has eigenvalues $\lambda_i$ and corresponding eigenvectors
$|u_i\rangle$. The adjoint matrix $A^\dagger$ has the same set of eigenvalues but different
corresponding eigenvectors, $|v_i\rangle$. Show that the eigenvectors form a biorthogonal
set in the sense that

$\langle v_i|u_j\rangle = 0$ for $\lambda_i^* \neq \lambda_j$.

4.7.12 You are given a pair of equations:

$A|f_n\rangle = \lambda_n|g_n\rangle$

$\tilde{A}|g_n\rangle = \lambda_n|f_n\rangle$ with A real.

(a) Prove that $|f_n\rangle$ is an eigenvector of $(\tilde{A}A)$ with eigenvalue $\lambda_n^2$.

(b) Prove that $|g_n\rangle$ is an eigenvector of $(A\tilde{A})$ with eigenvalue $\lambda_n^2$.

(c) State how you know that

1. The $|f_n\rangle$ form an orthogonal set.

2. The $|g_n\rangle$ form an orthogonal set.

3. $\lambda_n^2$ is real.

4.7.13 Prove that A of the preceding problem may be written as

$A = A_1 = \sum_n \lambda_n |g_n\rangle\langle f_n|,$

with the $|g_n\rangle$ and $\langle f_n|$ normalized to unity.

Hint. (a) Show that $A_1$ operating on an arbitrary vector yields the same
result as A operating on that vector.

(b) Expand your arbitrary vector as a linear combination of $|f_n\rangle$.


4.7.14 Given 


A = -L( 2 2 

Vni -4 


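(a) Construct the transpose $\tilde{A}$ and the symmetric forms $\tilde{A}A$ and $A\tilde{A}$.

(b) From $A\tilde{A}|g_n\rangle = \lambda_n^2|g_n\rangle$
find $\lambda_n$ and $|g_n\rangle$. Normalize the $|g_n\rangle$'s.

(c) From $\tilde{A}A|f_n\rangle = \lambda_n^2|f_n\rangle$
find $\lambda_n$ [same as (b)] and $|f_n\rangle$. Normalize the $|f_n\rangle$'s.

(d) Verify that
$A|f_n\rangle = \lambda_n|g_n\rangle$ and $\tilde{A}|g_n\rangle = \lambda_n|f_n\rangle$.

(e) Verify that
$A = \sum_n \lambda_n|g_n\rangle\langle f_n|.$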


4.7.15 Given the eigenvalues $\lambda_1 = 1$, $\lambda_2 =$ [value missing] and the corresponding
vectors $|f_n\rangle$ and $|g_n\rangle$ [values missing],

(a) Construct A.

(b) Verify that $A|f_n\rangle = \lambda_n|g_n\rangle$.

(c) Verify that $\tilde{A}|g_n\rangle = \lambda_n|f_n\rangle$.

ANS. [matrix missing]

4.7.16 This is a continuation of Exercise 4.5.12, where the unitary matrix U and the
Hermitian matrix H are related by

$U = e^{iaH}.$

(a) If trace H = 0, show that det U = +1.

(b) If det U = +1, show that trace H = 0.

Hint. H may be diagonalized by a similarity transformation. Then, interpreting
the exponential by a Maclaurin expansion, U is also diagonal. The corresponding
eigenvalues are given by $u_j = \exp(iah_j)$.

Note. These properties, and those of Exercise 4.5.12, are vital in the development
of the concept of generators in group theory, Section 4.11.


4.7.17 An n x n matrix A has n eigenvalues $A_i$. If $B = e^A$ show that B has the same
eigenvectors as A with the corresponding eigenvalues $B_i$ given by $B_i = \exp(A_i)$.
Note. $e^A$ is defined by the Maclaurin expansion of the exponential:

$e^A = 1 + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots$


4.7.18 A matrix P is a projection operator satisfying the condition 

P 2 = P. 

Show that the corresponding eigenvalues $(\rho^2)_\lambda$ and $\rho_\lambda$ satisfy the relation

$(\rho^2)_\lambda = (\rho_\lambda)^2 = \rho_\lambda.$

This means that the eigenvalues of P are 0 and 1.
4.7.19 In the matrix eigenvector, eigenvalue equation

$A|r_i\rangle = \lambda_i|r_i\rangle,$

A is an n x n Hermitian matrix. For simplicity assume that its n real eigenvalues
are distinct, $\lambda_1$ being the largest. If $|r\rangle$ is an approximation to $|r_1\rangle$,

$|r\rangle = |r_1\rangle + \sum_{i=2}^{n} \delta_i|r_i\rangle,$

show that

$\frac{\langle r|A|r\rangle}{\langle r|r\rangle} \le \lambda_1$

and that the error in $\lambda_1$ is of the order $|\delta_i|^2$. Take $|\delta_i| \ll 1$.

Hint. The n $|r_i\rangle$ form a complete orthogonal set spanning the n-dimensional
(complex) space.


4.7.20 Two equal masses are connected to each other and to walls by springs as shown
in the figure. The masses are constrained to stay on a horizontal line.

(a) Set up the Newtonian acceleration equation for each mass.

(b) Solve the secular equation for the eigenvalues.

(c) Determine the eigenvectors and thus the normal modes of motion.

[Figure not reproduced: two masses joined by springs to each other and to the walls.]


4.8 INTRODUCTION TO GROUP THEORY 

The theory of finite groups, developed originally as a branch of pure mathe- 
matics, can be a beautiful, fascinating toy. For the physicist, group theory, with- 
out any loss of its beauty, is also an extraordinarily useful tool for formalizing 
semi-intuitive concepts and for exploiting symmetries. Group theory becomes 
a useful tool for the development of crystallography and solid state physics 
when we introduce specific representations (matrices) and start calculating 
group characters (traces). A brief introduction to this area appears in Section 
4.9. Perhaps even more important in physics is the extension of group theory to 
continuous groups 1 and the applications of these continuous groups to quan- 
tum theory and the particles of high energy physics. This is the topic of Sections 
4.10 to 4.11. 

As knowledge of our physical world expanded almost explosively in the first 
third of this century, Wigner and others realized that invariance was a key con- 
cept in understanding the new phenomena and in developing appropriate 
theories. The mathematical tool for treating invariants and symmetries is group 
theory. It represents a unification and formalization of principles such as parity 
and angular momentum that are widely used by physicists. Parity is related to 


1 These are groups with an infinite number of elements. Each element depends 

on one or more parameters which vary continuously. 
invariance under inversion. Conservation of angular momentum is a direct
consequence of rotational symmetry, which means invariance under spatial
rotations. Although the formal techniques of group theory may not be
necessary, these powerful mathematical techniques can save much labor. Group
theory can produce a unification that (once grasped) leads to greater simplicity.

Definition of Group 

A group G may be defined as a set of objects or operations (called the 
elements ) that may be combined or “multiplied” to form a well-defined product 
and that satisfy the following four conditions. We label the set of elements 
a, b, c, ... : 

1. If a and b are any two elements, then the product ab
is also a member of the set.

2. The defined multiplication is associative, $(ab)c = a(bc)$.
This is automatic for matrix multiplication.

3. There is a unit element I such that $Ia = aI = a$ for
every element in the set.²

4. There must be an inverse or reciprocal of each element.
The set must contain an element $b = a^{-1}$ such
that $aa^{-1} = a^{-1}a = I$ for each element of the set.

In physics, these abstract conditions often take on direct physical meaning in 
terms of transformations of vectors, spinors, and tensors. 

As a very simple, but not trivial, example of a group, consider the set 1, a, b, c 
that combine according to the group multiplication table 3 

        |  1   a   b   c
      --+---------------
      1 |  1   a   b   c
      a |  a   b   c   1
      b |  b   c   1   a
      c |  c   1   a   b

Clearly, the four conditions of the definition of “group” are satisfied. The 
elements a, b , c, and 1 are abstract mathematical entities, completely unre- 
stricted except for the preceding multiplication table. 

Now, for a specific representation of these group elements, let 

$$1 \to 1, \qquad a \to i, \qquad b \to -1, \qquad c \to -i, \tag{4.207}$$

combining by ordinary multiplication. Again, the four group conditions are 
satisfied, and these four elements form a group. We label this group $C_4$. Since 
the multiplication of the group elements is commutative, the group is labeled 


2 Following Wigner, the unit element of a group is often labeled E, from the 
German Einheit, the unit. 

3 The order of the factors is row-column: ab = c in the indicated previous 
example. 
commutative or abelian. Our group is also a cyclic group in that the elements may 
be written as successive powers of one element, in this case $i^n$, n = 0, 1, 2, 3. Note 
that in writing out Eq. 4.207, we have selected a specific representation for this 
group of four objects, $C_4$. 

We recognize that the group elements 1, i, −1, −i may be interpreted as 
successive 90° rotations in the complex plane. Then, from Eq. 4.63 we create the 
set of four 2 × 2 matrices (replacing $\varphi$ by $-\varphi$ in Eq. 4.63 to rotate a vector 
rather than rotate the coordinates): 


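$$R(\varphi) = \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix},$$

and for $\varphi = 0$, $\pi/2$, $\pi$, and $3\pi/2$ we have 

$$1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad
B = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \quad
C = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{4.208}$$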


This set of four matrices forms a group with the law of combination being matrix 
multiplication. Here is a second representation, now in terms of matrices. A 
little matrix multiplication verifies that this representation is also abelian and 
cyclic. Clearly, there is a correspondence of the two representations 

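$$1 \leftrightarrow 1 \leftrightarrow 1, \qquad a \leftrightarrow i \leftrightarrow A, \qquad b \leftrightarrow -1 \leftrightarrow B, \qquad c \leftrightarrow -i \leftrightarrow C. \tag{4.209}$$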
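The "little matrix multiplication" can be sketched in a few lines of Python. This is only an illustrative check, not part of the original text: the four matrices are the rotations through 0, π/2, π, and 3π/2 of Eq. 4.208, and the helper names (`matmul`, `name_of`, `table`) are made up for this example.

```python
# The four 2x2 rotation matrices of Eq. 4.208, stored as nested tuples.
ONE = ((1, 0), (0, 1))
A = ((0, -1), (1, 0))
B = ((-1, 0), (0, -1))
C = ((0, 1), (-1, 0))
ELEMENTS = {"1": ONE, "A": A, "B": B, "C": C}


def matmul(x, y):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def name_of(m):
    """Label of a matrix if it belongs to the set, else None."""
    for label, element in ELEMENTS.items():
        if element == m:
            return label
    return None


# Multiplication table: entry (p, q) is the row element p times the column element q.
table = {(p, q): name_of(matmul(ELEMENTS[p], ELEMENTS[q]))
         for p in ELEMENTS for q in ELEMENTS}

closed = all(v is not None for v in table.values())
abelian = all(table[p, q] == table[q, p] for p in ELEMENTS for q in ELEMENTS)

# Successive powers of A: A^0, A^1, A^2, A^3.
A2 = matmul(A, A)
A3 = matmul(A, A2)
powers_of_A = [name_of(m) for m in (ONE, A, A2, A3)]
```

If the sketch is right, `closed` and `abelian` are both true and `powers_of_A` runs through all four labels, which is the statement that the representation is abelian and cyclic.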


Homomorphism, Isomorphism 

There may be a correspondence between the elements of two groups (or 
between two representations), one-to-one, two-to-one, or many-to-one. If this 
correspondence satisfies the same group multiplication table, we say that the 
two groups are homomorphic. A most important homomorphic correspondence 
between the groups $O_3^+$ and SU(2) is developed in Section 4.10. As a special 
case, if the correspondence is one-to-one, still preserving the multiplication 
table, then the groups are isomorphic. 4 In the group $C_4$ the two representations 
(1, i, −1, −i) and (1, A, B, C) are isomorphic. 

In contrast to this, there is no such correspondence between either of these 
representations of group $C_4$ and another group of four objects, the vierergruppe 
(Exercise 4.2.7). The vierergruppe has a multiplication table: 

          |  I    V1   V2   V3
      ----+-------------------
       I  |  I    V1   V2   V3
       V1 |  V1   I    V3   V2
       V2 |  V2   V3   I    V1
       V3 |  V3   V2   V1   I


4 Suppose the elements of one group are labeled $g_i$, the elements of a second 
group $h_i$. Then $g_i \leftrightarrow h_i$, a one-to-one correspondence for all values of i. Also, 
if $g_i g_j = g_k$ and $h_i h_j = h_k$, then $g_k$ and $h_k$ must be corresponding elements. 
Confirming the lack of correspondence between the group represented by 
(1, i, −1, −i) or the matrices (1, A, B, C) of Eq. 4.208, note that although the 
vierergruppe is abelian, it is not cyclic. The cyclic group $C_4$ and the vierergruppe 
are not isomorphic. 
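One way to see the difference concretely is to compare the orders of the elements, that is, the smallest power of each element that gives the unit element. The sketch below is an illustration only; the tables are the two multiplication tables of this section, and the function names are hypothetical.

```python
def make_table(labels, rows):
    """Build table[row][column] from a list of labels and a list of rows."""
    return {r: dict(zip(labels, row)) for r, row in zip(labels, rows)}


def order(table, identity, g):
    """Smallest n >= 1 such that g multiplied by itself n times is the identity."""
    n, power = 1, g
    while power != identity:
        power = table[power][g]
        n += 1
    return n


c4 = make_table(
    ["1", "a", "b", "c"],
    [["1", "a", "b", "c"],
     ["a", "b", "c", "1"],
     ["b", "c", "1", "a"],
     ["c", "1", "a", "b"]],
)
v4 = make_table(
    ["I", "V1", "V2", "V3"],
    [["I", "V1", "V2", "V3"],
     ["V1", "I", "V3", "V2"],
     ["V2", "V3", "I", "V1"],
     ["V3", "V2", "V1", "I"]],
)

c4_orders = sorted(order(c4, "1", g) for g in c4)
v4_orders = sorted(order(v4, "I", g) for g in v4)
```

A cyclic group of four elements must contain an element of order 4. The first list contains such an element and the second does not, so no one-to-one correspondence can preserve both tables.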


Matrix Representations — Reducible and 
Irreducible 

The representation of group elements by matrices is a very powerful tech- 
nique and has been almost universally adopted among physicists. The use of 
matrices imposes no significant restriction. It can be shown that the elements of 
any finite group and of the continuous groups of Section 4.10 may be 
represented by matrices and, in particular, by unitary matrices. In quantum 
mechanics these unitary representations assume a special importance since unitary 
matrices can be diagonalized, and the eigenvalues can serve for the classification 
of quantum states. 

If there exists a unitary transformation 5 that will transform our original 
representation matrices into a diagonal or block-diagonal form, for example, 

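$$\begin{pmatrix}
r_{11} & r_{12} & r_{13} & r_{14} \\
r_{21} & r_{22} & r_{23} & r_{24} \\
r_{31} & r_{32} & r_{33} & r_{34} \\
r_{41} & r_{42} & r_{43} & r_{44}
\end{pmatrix}
\longrightarrow
\begin{pmatrix}
p_{11} & p_{12} & 0 & 0 \\
p_{21} & p_{22} & 0 & 0 \\
0 & 0 & q_{11} & q_{12} \\
0 & 0 & q_{21} & q_{22}
\end{pmatrix} \tag{4.210}$$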


such that the smaller portions or submatrices are no longer coupled together, 
then the original representation is reducible. Equivalently, we have 


$$S R S^{-1} = \begin{pmatrix} P & 0 \\ 0 & Q \end{pmatrix}. \tag{4.211}$$


If R is an n × n matrix, we might have P an m × m matrix, and Q an (n − m) × 
(n − m) matrix. The 0's are then rectangular matrices m × (n − m) and 
(n − m) × m with all elements zero. We may write this result as 

$$R = P \oplus Q, \tag{4.212}$$

and say that R has been decomposed into the representations P and Q. For 
instance, all representations of dimension greater than 1 of abelian groups are 
reducible. If no such unitary transformation exists, the representation is 
irreducible. Among the Dirac matrices of Table 4.1, 1, $\sigma_1$, $\sigma_2$, $\sigma_3$, $\rho_3$, $\delta_2$, 
and $\delta_3$ are in this reduced form. The topic of Exercise 4.8.1 is to show that the 
matrices 1, A, B, and C form a reducible representation and to reduce them to 
the irreducible representations. The 2 × 2 matrix representation of the 
vierergruppe is likewise reducible. 
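As a hedged illustration of what "reducing" means for the 2 × 2 representation (1, A, B, C) of $C_4$, the sketch below applies one fixed unitary transformation to all four matrices and leaves every one of them diagonal. The choice of transformation (built from the normalized eigenvectors (1, −i) and (1, i) of A) and all names in the code are assumptions made for this example, not the text's own solution.

```python
import math


def matmul(x, y):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def dagger(m):
    """Hermitian conjugate: transpose plus complex conjugation."""
    return tuple(
        tuple(complex(m[j][i]).conjugate() for j in range(2)) for i in range(2)
    )


MATRICES = {
    "1": ((1, 0), (0, 1)),
    "A": ((0, -1), (1, 0)),
    "B": ((-1, 0), (0, -1)),
    "C": ((0, 1), (-1, 0)),
}

# Columns of V are the normalized eigenvectors (1, -i) and (1, i) of A.
# V is unitary, so S = V^dagger has inverse V and S M S^{-1} = V^dagger M V.
r = 1 / math.sqrt(2)
V = ((r, r), (-1j * r, 1j * r))
S = dagger(V)

reduced = {name: matmul(S, matmul(m, V)) for name, m in MATRICES.items()}

off_diagonal_size = max(
    max(abs(m[0][1]), abs(m[1][0])) for m in reduced.values()
)
```

After the transformation the diagonal entries of A are i and −i, and those of C are −i and i, so the two-dimensional representation has split into two one-dimensional ones.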

The irreducible representations play a role in group theory that is roughly 
analogous to the unit vectors of vector analysis. They are the simplest represen- 
tations — all others may be built up from them. 


5 A unitary matrix remains unitary under a unitary transformation. 
Classes and Character 

Consider a group element x transformed into a group element y by a 
similarity transform with respect to $g_i$, an element of the group 

$$g_i x g_i^{-1} = y. \tag{4.213}$$

The group element y is conjugate to x. A class is a set of mutually conjugate 
group elements. In general, this set of elements forming a class does not satisfy 
the group postulates and is not a group. Indeed, the unit element 1, which is 
always in a class by itself, is the only class that is also a subgroup. All members 
of a given class are equivalent in the sense that any one element is a similarity 
transform of any other element. Clearly, if a group is abelian, every element is a 
class by itself. We find that 

1 . Every element of the original group belongs to one 
and only one class. 

2. The number of elements in a class is a factor of the 
order of the group. 

We get a possible physical interpretation of the concept of class by noting 
that y is a similarity transform of x. If $g_i$ represents a rotation of the coordinate 
system, then y is the same operation as x but relative to the new, related 
coordinates. 

In Section 4.3 we see that a real matrix transforms under rotation of the 
coordinates by an orthogonal similarity transformation. Depending on the 
choice of reference frame, essentially the same matrix may take on an infinity of 
different forms. Likewise, our group representations may be put in an infinity 
of different forms by using unitary transformations. But each such transformed 
representation is isomorphic with the original. From Exercise 4.3.9 the trace of 
each element (each matrix of our representation) is invariant under unitary 
transformations. Just because it is invariant, the trace (relabeled the character) 
assumes a role of some importance in group theory, particularly in applications 
to solid state physics. Clearly, all members of a given class (in a given represen- 
tation) have the same character. Elements of different classes may have the same 
character but elements with different characters cannot be in the same class. 

The concept of class is important (1) because of the trace or character and 
(2) because the number of nonequivalent irreducible representations of a group is 
equal to the number of classes. 
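The sorting of a group into classes can be carried out mechanically. The following sketch is an illustration under stated assumptions: the group used is the set of six permutations of three objects (a nonabelian group chosen only for this example), each permutation is stored as a tuple, and the trace of the corresponding 3 × 3 permutation matrix is taken to be the number of objects left in place.

```python
from itertools import permutations


def compose(p, q):
    """Permutation 'p after q': position k is sent to p[q[k]]."""
    return tuple(p[k] for k in q)


def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for k, image in enumerate(p):
        inv[image] = k
    return tuple(inv)


def conjugacy_classes(elements):
    """Split a finite group into sets of mutually conjugate elements."""
    classes, seen = [], set()
    for x in elements:
        if x in seen:
            continue
        cls = {compose(compose(g, x), inverse(g)) for g in elements}
        classes.append(cls)
        seen |= cls
    return classes


group = list(permutations(range(3)))
classes = conjugacy_classes(group)
sizes = sorted(len(c) for c in classes)

# Character of the permutation matrix = number of fixed points.
traces = [{sum(1 for k, image in enumerate(p) if k == image) for p in c}
          for c in classes]
```

Each entry of `traces` is a set with a single member, which illustrates that all members of a class share one character, and every class size divides the order of the group.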

Subgroups and Cosets 

Frequently a subset of the group elements (including the unit element I) will 
by itself satisfy the four group requirements and therefore is a group. Such a 
subset is called a subgroup. Every group has two trivial subgroups: the unit 
element alone and the group itself. The elements 1 and b of the four element 
group $C_4$ discussed earlier form a nontrivial subgroup. In Section 4.10 we 
consider $O_3^+$, the (continuous) group of all rotations in ordinary space. The 
rotations about any single axis form a subgroup of $O_3^+$. Numerous other examples 
of subgroups appear in the following sections. 
Consider a subgroup H with elements $h_i$ and a group element x not in H. 
Then $x h_i$ and $h_i x$ are not in subgroup H. The sets generated by 

$$x h_i, \quad i = 1, 2, \ldots \qquad \text{and} \qquad h_i x, \quad i = 1, 2, \ldots$$

are called cosets, respectively, the left and right cosets of subgroup H with 
respect to x. It can be shown (assume the contrary and prove a contradiction) 
that the coset of a subgroup has the same number of distinct elements as the 
subgroup. Extending this result we may express the original group G as the sum 
of H and cosets: 

$$G = H + x_1 H + x_2 H + \cdots.$$

Then the order of any subgroup is a divisor of the order of the group. It is this 
result that makes the concept of coset significant. In the next section the 
six-element group $D_3$ (order 6) has subgroups of order 1, 2, and 3. $D_3$ cannot (and 
does not) have subgroups of order 4 or 5. 
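A small numerical illustration of the coset decomposition, using the cyclic group $C_4$ and its two-element subgroup (1, b): in this sketch (an assumption of the example, not the text's notation) the element $i^k$ is stored as the exponent k, so that group multiplication becomes addition modulo 4.

```python
N = 4
G = set(range(N))          # exponent k stands for the element i**k
H = frozenset({0, 2})      # the subgroup (1, -1), that is, (1, b)


def left_coset(x, subgroup):
    """The set x*h for all h in the subgroup (exponents add modulo N)."""
    return frozenset((x + h) % N for h in subgroup)


# Collect the distinct cosets generated by every element of G.
cosets = {left_coset(x, H) for x in G}
covered = set().union(*cosets)
```

Here the subgroup and one further coset exhaust the group, each with two elements, so the order of the subgroup (2) divides the order of the group (4).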

The similarity transform of a subgroup H by a fixed group element x not in 
H, $xHx^{-1}$, yields a subgroup (Exercise 4.8.8). If this new subgroup is identical 
with H for all x, 

$$xHx^{-1} = H,$$

then H is called an invariant, normal, or self-conjugate subgroup. Such subgroups 
are involved in the analysis of multiplets of atomic and nuclear spectra and the 
particles discussed in Section 4.12. All subgroups of a commutative (abelian) 
group are automatically invariant. 


EXERCISES 

4.8.1 Show that the matrices 1, A, B, and C of Eq. 4.208 are reducible. Reduce them. 
Note. This means transforming A and C to diagonal form (by the same unitary 
transformation). 

Hint. A and C are anti-Hermitian. Their eigenvectors will be orthogonal. 

4.8.2 Possible operations on a crystal lattice include $A_\pi$ (rotation by π), m (reflection), 
and i (inversion). These three operations combine as 

$$A_\pi^2 = m^2 = i^2 = 1,$$

$$A_\pi \cdot m = i, \qquad m \cdot i = A_\pi, \qquad \text{and} \qquad i \cdot A_\pi = m.$$

Show that the group (1, $A_\pi$, m, i) is isomorphic with the vierergruppe. 

4.8.3 Four possible operations in the xy-plane are: 

1. no change: x → x, y → y 

2. inversion: x → −x, y → −y 

3. reflection: x → −x, y → y 

4. reflection: x → x, y → −y. 

(a) Show that these four operations form a group. 

(b) Show that this group is isomorphic with the vierergruppe. 

(c) Set up a 2 x 2 matrix representation. 

4.8.4 Rearrangement theorem. 

Given a group of n distinct elements (I, a, b, c, ..., n), show that the set of products 
(aI, $a^2$, ab, ac, ..., an) reproduces the n distinct elements in a new order. 

4.8.5 Using the 2 × 2 matrix representation of Exercise 4.2.7 for the vierergruppe, 

(a) Show that there are four classes, each with one element. 

(b) Calculate the character (trace) of each class. Note that two different classes 
may have the same character. 

(c) Show that there are three two-element subgroups. (The unit element by 
itself always forms a subgroup.) 

(d) For any one of the two-element subgroups show that the subgroup and a 
single coset reproduce the original vierergruppe. 

Note that subgroups, classes, and cosets are entirely different. 

4.8.6 Using the 2 × 2 matrix representation, Eq. 4.208, of $C_4$, 

(a) Show that there are four classes, each with one element. 

(b) Calculate the character (trace) of each class. 

(c) Show that there is one two-element subgroup. 

(d) Show that the subgroup and a single coset reproduce the original group. 

4.8.7 Prove that the number of distinct elements in a coset of a subgroup is the same 
as the number of elements in the subgroup. 

4.8.8 A subgroup H has elements $h_i$. x is a fixed element of the original group G and 
is not a member of H. The transform 

$$x h_i x^{-1}, \qquad i = 1, 2, \ldots$$

generates a conjugate subgroup $xHx^{-1}$. Show that this conjugate subgroup satisfies 
each of the four group postulates and therefore is a group. 

4.8.9 (a) A particular group is abelian. A second group is created by replacing $g_i$ by 
$g_i^{-1}$ for each element in the original group. Show that the two groups are 
isomorphic. 

Note. This means showing that if $a_i b_i = c_i$, then $a_i^{-1} b_i^{-1} = c_i^{-1}$. 

(b) Continuing part (a), if the two groups are isomorphic, show that each must 
be abelian. 


4.9 DISCRETE GROUPS 

In physics, groups usually appear as a set of operations that leave a system 
unchanged, invariant. This is an expression of symmetry. Indeed, a symmetry 
may be defined as the invariance of the Hamiltonian of a system under a group 
of transformations. Symmetry in this sense is important in classical mechanics, 
but it becomes even more important and more profound in quantum mechanics. 
In this section we investigate the symmetry properties of sets of objects (atoms 
in a molecule or crystal). This provides additional illustrations of the group 
concepts of Section 4.8 and leads directly to dihedral groups. The dihedral 
groups in turn open up the study of the 32 point groups and 230 space groups 
that are of such importance in crystallography and solid state physics. It might 
be noted that it was through the study of crystal symmetries that the concepts of 
symmetry and group theory entered physics. 


Two Objects — Twofold Symmetry Axis 

Consider first the two-dimensional system of two identical atoms in the 
xy-plane at (1, 0) and (−1, 0), Fig. 4.8. What rotations 1 can be carried out (keeping 
both atoms in the xy-plane) that will leave this system invariant? The first 
candidate is, of course, the unit operator 1. A rotation of π radians about the 
z-axis completes the list. So we have a rather uninteresting group of two 
members (1, −1). The z-axis is labeled a twofold symmetry axis, corresponding to 
the two rotation angles 0 and π that leave the system invariant. 



Our system becomes more interesting in three dimensions. Now imagine a 
molecule (or part of a crystal) with atoms of element X at ±a on the x-axis, 
atoms of element Y at ±b on the y-axis, and atoms of element Z at ±c on the 
z-axis as shown in Fig. 4.9. Clearly, each axis is now a twofold symmetry axis. 
Using $R_x(\pi)$ to designate a rotation of π radians about the x-axis, we may set up 
a matrix representation of the rotations as in Section 4.3: 

$$R_x(\pi) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \quad
R_y(\pi) = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \quad
R_z(\pi) = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{4.214}$$


1 Here we deliberately exclude reflections and inversions. They must be 
brought in to develop the full set of 32 point groups. 
These four elements [1, $R_x(\pi)$, $R_y(\pi)$, $R_z(\pi)$] form an abelian group with a 
group multiplication table: 

                |  1     R_x(π)  R_y(π)  R_z(π)
      ----------+------------------------------
       1        |  1     R_x     R_y     R_z
       R_x(π)   |  R_x   1       R_z     R_y
       R_y(π)   |  R_y   R_z     1       R_x
       R_z(π)   |  R_z   R_y     R_x     1

The products shown in this table can be obtained in either of two distinct 
ways: (1) We may analyze the operations themselves: a rotation of π about the 
x-axis followed by a rotation of π about the y-axis is equivalent to a rotation of 
π about the z-axis, $R_y(\pi) R_x(\pi) = R_z(\pi)$. (2) Alternatively, once the matrix 
representation is established, we can obtain the products by matrix 
multiplication. This is where the power of mathematics is shown: when the system is too 
complex for a direct physical interpretation. 

Comparison with Exercises 4.2.7, 4.8.2, or 4.8.3 shows immediately that this 
group is the vierergruppe. The matrices of Eq. 4.214 are isomorphic with those 
of Exercise 4.2.7. Also, they are obviously reducible, being diagonal. The 
subgroups are (1, $R_x$), (1, $R_y$), and (1, $R_z$). They are invariant. It should be noted 
that a rotation of π about the y-axis and a rotation of π about the z-axis is 
equivalent to a rotation of π about the x-axis: $R_z(\pi) R_y(\pi) = R_x(\pi)$. In symmetry 
terms, if y and z are twofold symmetry axes, x is automatically a twofold 
symmetry axis. 
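Because the matrices of Eq. 4.214 are diagonal, their products reduce to entry-by-entry multiplication of the diagonals. The short sketch below uses that shortcut (an assumption of this example; the tuple names are illustrative) to check the products quoted above.

```python
# Diagonals of the 3x3 matrices of Eq. 4.214.
ONE = (1, 1, 1)
RX = (1, -1, -1)
RY = (-1, 1, -1)
RZ = (-1, -1, 1)
GROUP = {"1": ONE, "Rx": RX, "Ry": RY, "Rz": RZ}


def product(p, q):
    """Product of two diagonal matrices, represented by their diagonals."""
    return tuple(a * b for a, b in zip(p, q))


ry_rx = product(RY, RX)
rz_ry = product(RZ, RY)
squares = {name: product(m, m) for name, m in GROUP.items()}
closed = all(product(p, q) in GROUP.values()
             for p in GROUP.values() for q in GROUP.values())
```

Every element squares to the unit element, the pattern that distinguishes the vierergruppe from the cyclic group of four elements.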

This symmetry group, 2 the vierergruppe, is often labeled $D_2$, the D signifying 
a dihedral group and the subscript 2 signifying a twofold symmetry axis (and no 
higher symmetry axis). 


2 A symmetry group is a group of symmetry-preserving operations, that is, 
rotations, reflections, and inversions. A symmetric group is the group of 
permutations of n distinct objects — of order n ! 
Three Objects — Threefold Symmetry Axis 

Consider now three identical atoms at the vertices of an equilateral triangle, 
Fig. 4.10. Rotations of the triangle of 0, 2π/3, and 4π/3 leave the triangle 
invariant. In matrix form, we have 3 

$$1 = R_z(0) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$$

$$A = R_z(2\pi/3) = \begin{pmatrix} \cos 2\pi/3 & -\sin 2\pi/3 \\ \sin 2\pi/3 & \cos 2\pi/3 \end{pmatrix}
= \begin{pmatrix} -1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & -1/2 \end{pmatrix},$$

$$B = R_z(4\pi/3) = \begin{pmatrix} -1/2 & \sqrt{3}/2 \\ -\sqrt{3}/2 & -1/2 \end{pmatrix}. \tag{4.215}$$

The z-axis is a threefold symmetry axis. (1 , A, B) form a cyclic group, a sub- 
group of the complete six-element group that follows. 

In the xy-plane there are three additional axes of symmetry, each atom 
(vertex) and the geometric center defining an axis. Each of these is a twofold 
symmetry axis. These rotations may most easily be described within our 
two-dimensional framework by introducing reflections. The rotation of π about the 
C or y-axis, which means the interchanging of atoms a and c, is just a reflection 
of the x-axis: 


3 Note that here we are rotating the triangle counterclockwise relative to 
fixed coordinates. 



$$C = R_C(\pi) = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{4.216}$$


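We may replace the rotation about the D-axis by a rotation of 4π/3 (about our 
z-axis) followed by a reflection of the x-axis (x → −x) (Fig. 4.11): 

$$D = R_D(\pi) = CB
= \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} -1/2 & \sqrt{3}/2 \\ -\sqrt{3}/2 & -1/2 \end{pmatrix}
= \begin{pmatrix} 1/2 & -\sqrt{3}/2 \\ -\sqrt{3}/2 & -1/2 \end{pmatrix}. \tag{4.217}$$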


FIG. 4.11 The triangle on the right is the triangle on the left rotated 180° about 
the D-axis. D = CB. 


In a similar manner, the rotation of π about the E-axis interchanging a and b is 
replaced by a rotation of 2π/3 (A) and then a reflection 4 of the x-axis (x → −x): 

$$E = R_E(\pi) = CA
= \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} -1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & -1/2 \end{pmatrix}
= \begin{pmatrix} 1/2 & \sqrt{3}/2 \\ \sqrt{3}/2 & -1/2 \end{pmatrix}. \tag{4.218}$$

The complete group multiplication table is 



        |  1   A   B   C   D   E
      --+-----------------------
      1 |  1   A   B   C   D   E
      A |  A   B   1   D   E   C
      B |  B   1   A   E   C   D
      C |  C   E   D   1   B   A
      D |  D   C   E   A   1   B
      E |  E   D   C   B   A   1


Notice that each element of the group appears only once in each row and in each 
column — as required by the rearrangement theorem, Exercise 4.8.4. Also, from 
the multiplication table the group is not abelian. We have constructed a six- 


4 Note that, as a consequence of these reflections, det(C) = det(D) = det(E) = 
−1. The rotations A and B, of course, have a determinant of +1. 
element group and a 2 × 2 irreducible matrix representation of it. The only 
other distinct six-element group is the cyclic group [1, R, $R^2$, $R^3$, $R^4$, $R^5$] with 

$$R = \begin{pmatrix} \cos \pi/3 & -\sin \pi/3 \\ \sin \pi/3 & \cos \pi/3 \end{pmatrix}
= \begin{pmatrix} 1/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & 1/2 \end{pmatrix}. \tag{4.219}$$


Our group [1, A, B, C, D, E] is labeled $D_3$ in crystallography, the dihedral 
group with a threefold axis of symmetry. The three axes (C, D, and E) in the 
xy-plane automatically become twofold symmetry axes. As a consequence, 
(1, C), (1, D), and (1, E) all form two-element subgroups. None of these 
two-element subgroups of $D_3$ is invariant. 
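The $D_3$ multiplication table can be regenerated directly from the 2 × 2 matrices of Eqs. 4.215 to 4.218. The sketch below is an illustration only: products are matched to group elements with a small floating-point tolerance, and the helper names are hypothetical.

```python
import math

s = math.sqrt(3) / 2


def matmul(x, y):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


ELEMENTS = {
    "1": ((1.0, 0.0), (0.0, 1.0)),
    "A": ((-0.5, -s), (s, -0.5)),      # rotation by 2*pi/3
    "B": ((-0.5, s), (-s, -0.5)),      # rotation by 4*pi/3
    "C": ((-1.0, 0.0), (0.0, 1.0)),    # reflection x -> -x
}
ELEMENTS["D"] = matmul(ELEMENTS["C"], ELEMENTS["B"])   # D = CB
ELEMENTS["E"] = matmul(ELEMENTS["C"], ELEMENTS["A"])   # E = CA
LABELS = list(ELEMENTS)   # insertion order: 1, A, B, C, D, E


def name_of(m, tol=1e-9):
    """Label of the group element equal to m within the tolerance."""
    for label, e in ELEMENTS.items():
        if all(abs(m[i][j] - e[i][j]) < tol for i in range(2) for j in range(2)):
            return label
    raise ValueError("product is not in the set")


# table[p] is the row of products p*q, with q running over LABELS.
table = {p: [name_of(matmul(ELEMENTS[p], ELEMENTS[q])) for q in LABELS]
         for p in LABELS}

rearranged = all(sorted(row) == sorted(LABELS) for row in table.values())
abelian = all(table[p][LABELS.index(q)] == table[q][LABELS.index(p)]
              for p in LABELS for q in LABELS)
```

Each row contains every element exactly once, and `abelian` comes out false because, for example, AC and CA are different elements.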

There are two other irreducible representations of the symmetry group of the 
equilateral triangle: (1) the trivial (1, 1, 1, 1, 1, 1), and (2) the almost as trivial 
(1, 1, 1, −1, −1, −1), the positive signs corresponding to proper rotations and 
the negative signs to improper rotations (involving a reflection). Both of these 
representations are homomorphic with $D_3$. 

A general and most important result for finite groups of h elements is that 

$$\sum_i n_i^2 = h, \tag{4.220}$$

where $n_i$ is the dimension of the matrices of the ith irreducible representation. 
This equality, sometimes called the dimensionality theorem, is very useful in 
establishing the irreducible representation of a group. Here for $D_3$ we have 
$1^2 + 1^2 + 2^2 = 6$ for our three representations. No other irreducible 
representations of the symmetry group of three objects exist. 
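The way Eq. 4.220 pins down the dimensions can be sketched with a small search. This example assumes the number of irreducible representations is already known (three for $D_3$, four for the abelian vierergruppe, whose every element is a class by itself); the function name is made up for the illustration.

```python
import math


def irrep_dimensions(h, count):
    """All non-increasing tuples of `count` positive integers with squares summing to h."""
    solutions = []

    def search(remaining, slots, largest, partial):
        if slots == 0:
            if remaining == 0:
                solutions.append(tuple(partial))
            return
        # Try the largest allowed dimension first, never exceeding the previous one.
        for n in range(min(largest, math.isqrt(remaining)), 0, -1):
            search(remaining - n * n, slots - 1, n, partial + [n])

    search(h, count, h, [])
    return solutions


d3_dims = irrep_dimensions(6, 3)
vierergruppe_dims = irrep_dimensions(4, 4)
```

In both cases the search returns a single solution, so the dimensions of the irreducible representations are fixed by the order of the group and the number of representations.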


Dihedral Groups, D n 

A dihedral group $D_n$ with an n-fold symmetry axis implies n axes with angular 
separation of 2π/n radians. n is a positive integer, but otherwise unrestricted. If 
we apply the symmetry arguments to crystal lattices, then n is limited to 1, 2, 3, 4, 
and 6. The requirement of invariance of the crystal lattice under translations in 
the plane perpendicular to the n-fold axis excludes n = 5, 7, and higher values. 
Try to cover a plane completely with identical regular pentagons and with no 
overlapping. 5 For individual molecules, this constraint does not exist, although 
the examples with n > 6 are rare. n = 5 is a real possibility. As an example, the 
symmetry group for ruthenocene, $(\mathrm{C}_5\mathrm{H}_5)_2\mathrm{Ru}$, illustrated in Fig. 4.12, is $D_5$. 6 

Crystallographic Point and Space Groups 
The dihedral groups just considered are examples of the crystallographic 
point groups. A point group is composed of combinations of rotations and 
reflections (including inversions) that will leave some crystal lattice unchanged. 
Limiting the operations to rotations and reflections (including inversions) 
means that one point — the origin — remains fixed , hence the term point group. 


5 For D 6 imagine a plane covered with regular hexagons and the axis of 
rotation through the geometric center of one of them. 

6 Actually the full technical label is $D_{5h}$, the h indicating invariance under a 
reflection of the fivefold axis. 
Including the cyclic groups, two cubic groups (tetrahedron and octahedron 
symmetries), and the improper forms (involving reflections), we come to a total 
of 32 point groups. 

If, to the rotation and reflection operations that produced the point groups, 
we add the possibility of translations and still demand that some crystal lattice 
remain invariant, we come to the space groups. There are 230 distinct space 
groups, a number that is appalling except, possibly, to specialists in the field. 
For details (which can cover hundreds of pages) see the references. 


EXERCISES 

4.9.1 (a) Once you have a matrix representation of any group, a one-dimensional 

representation can be obtained by taking the determinants of the matrices. 
Show that the multiplicative relations are preserved in this determinant 
representation. 

(b) Use determinants to obtain a one-dimensional representative of $D_3$. 

4.9.2 Explain how the relation 

$$\sum_i n_i^2 = h$$

applies to the vierergruppe (h = 4) and to the dihedral group, $D_3$ (h = 6). 

4.9.3 Show that the subgroup (1, A, B) of $D_3$ is an invariant subgroup. 

4.9.4 The group $D_3$ may be discussed as a permutation group of three objects. Matrix 
B, for instance, rotates vertex a (originally in location 1) to the position formerly 
occupied by c (location 3). Vertex b moves from location 2 to location 1, and 
so on. As a permutation (a b c) → (b c a). In three dimensions 

$$\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}
\begin{pmatrix} a \\ b \\ c \end{pmatrix}
= \begin{pmatrix} b \\ c \\ a \end{pmatrix}.$$

(a) Develop analogous 3 × 3 representations for the other elements of $D_3$. 

(b) Reduce your 3 × 3 representation to the 2 × 2 representation of this section. 
(This 3 × 3 representation must be reducible or Eq. 4.220 would be violated.) 
Note. The actual reduction of a reducible representation may be awkward. It 
is often easier to develop directly a new representation of the required dimension. 


4.9.5 (a) The permutation group of four objects, $P_4$, has 4! = 24 elements. Treating 
the four elements of the cyclic group, $C_4$, as permutations, set up a 4 × 4 
matrix representation of $C_4$. $C_4$ becomes a subgroup of $P_4$. 

(b) How do you know that this 4 × 4 matrix representation of $C_4$ must be 
reducible? 

Note. $C_4$ is abelian and every abelian group of h objects has only h 
one-dimensional irreducible representations. 


4.9.6 (a) The objects (a b c d) are permuted to (d a c b). Write out a 4 × 4 matrix 
representation of this one permutation. 

(b) Is the permutation, (a b d c) → (d a c b), odd or even? 

(c) Is this permutation a possible member of the $D_4$ group? Why or why not? 


4.9.7 The elements of the dihedral group $D_n$ may be written in the form 

$$S^{\lambda} R_z^{\mu}(2\pi/n), \qquad \lambda = 0, 1, \qquad \mu = 0, 1, \ldots, n - 1,$$

where $R_z(2\pi/n)$ represents a rotation of 2π/n about the n-fold symmetry axis, 
whereas S represents a rotation of π about an axis through the center of the 
regular polygon and one of its vertices. 

For S = E show that this form may describe the matrices A, B, C, and D of $D_3$. 
Note. The elements $R_z$ and S are called the generators of this finite group. 
Similarly, i is the generator of the group given by Eq. 4.207. 

4.9.8 Show that the cyclic group of n objects, $C_n$, may be represented by $r^m$, m = 0, 1, 
2, ..., n − 1. Here r is a generator given by 

$$r = \exp(2\pi i s/n).$$

The parameter s takes on the values s = 1, 2, 3, ..., n, each value of s yielding 
a different one-dimensional (irreducible) representation of $C_n$. 


4.9.9 Develop the irreducible 2 × 2 matrix representation of the group of operations 
(rotations and reflections) that transform a square into itself. Give the group 
multiplication table. 

Note. This is the symmetry group of a square and also the dihedral group, $D_4$. 
4.9.10 The permutation group of four objects contains 4! = 24 elements. From Ex. 
4.9.9, $D_4$, the symmetry group for a square, has far less than 24 elements. Explain 
the relation between $D_4$ and the permutation group of four objects. 

4.9.11 A plane is covered with regular hexagons, as shown. 

(a) Determine the dihedral symmetry of an axis perpendicular to the plane 
through the common vertex of three hexagons (A). That is, if the axis has 
n-fold symmetry, show (with careful explanation) what n is. Write out the 
2 × 2 matrix describing the minimum (nonzero) positive rotation of the 
array of hexagons that is a member of your $D_n$ group. 

(b) Repeat part (a) for an axis perpendicular to the plane through the geometric 
center of one hexagon (B). 



4.9.12 In a simple cubic crystal, we might have identical atoms at r = (la, ma, na), l, 
m, and n taking on all integral values. 

(a) Show that each cartesian axis is a fourfold symmetry axis. 

(b) The cubic group will consist of all operations (rotations, reflections, in- 
version) that leave the simple cubic crystal invariant. From a consideration 
of the permutation of the positive and negative coordinate axes, predict 
how many elements this cubic group will contain. 

4.9.13 (a) From the $D_3$ multiplication table construct a similarity transform table 
showing $xyx^{-1}$, where x and y each range over all six elements of $D_3$. 

(b) Divide the elements of $D_3$ into classes. Using the 2 × 2 matrix representation 
of Eqs. 4.215 to 4.218 note the trace (character) of each class. 


4.10 CONTINUOUS GROUPS 

Infinite Groups, Lie Groups 

All of the groups in the two preceding sections have contained a finite 
number of elements: four for the vierergruppe, six for $D_3$, and so on. Here we 
introduce groups with an infinite number of elements. The group element will 
contain one or more parameters that vary continuously over some range. The 
continuously varying parameter gives rise to a continuum of group elements. In 
contrast to the four-member cyclic group (1, i, −1, −i), we might have $e^{i\varphi}$, with 
$\varphi$ varying continuously over the range [0, 2π]. The $O_3^+$ and SU(2) groups 
described subsequently are additional examples. 

Among the various mathematical possibilities, the continuous groups known 
as Lie groups are of particular interest. The characteristic of a Lie group is that 
the parameters of a product element are analytic functions 1 of the parameters of 
the factors. In the case of transformations, a rotation, for instance, we might 
write 

$$x_i' = f_i(x_1, x_2, x_3, \theta) \tag{4.221}$$

(compare Eq. 1.9). For this transformation group to be a Lie group the 
functions $f_i$ must be analytic functions of the parameter θ. This will be true for the 
$O_3^+$ and SU(2) groups considered here and in Section 4.11, for SU(3) 
encountered in Section 4.12, and for the Lorentz group of Section 4.13. All are 
Lie groups. The analytic nature of the functions (differentiability) allows us to 
develop the concept of generator (Section 4.11) and to reduce the study of the 
whole group to a study of the group elements in the neighborhood of the identity 
element. 

If these parameters vary over closed intervals such as [0, π], or [0, 2π] for 
angles, the group is compact. An important property of this is that every 
representation of a compact group is equivalent to a unitary representation. In 
contrast, the homogeneous Lorentz group of Section 4.13 is not compact and the 
representation L(v) is not unitary. 

We now consider two continuous groups: (1) the orthogonal group $O_3^+$ and 
(2) the special unitary group SU(2). A representation of $O_3^+$ is obtained from 
Section 4.3. For SU(2) a (2j + 1) × (2j + 1) representation is developed (Eq. 
4.235). Then these two groups are shown to be homomorphic, a two-to-one 
correspondence. From this homomorphism the SU(2) representation provides 
a series of representations of rotations and leads to the rotation matrix $D^j$. 


Orthogonal Group, $O_3^+$ 

The set of n × n real orthogonal matrices forms a group. (Check to see that 
the group properties of Section 4.8 are satisfied.) Our n × n matrix has 
n(n − 1)/2 independent parameters. For n = 2, there is only one independent 
parameter: one angle in Eq. 4.63. For n = 3 there are three independent 
parameters: the three Euler angles of Section 4.3. 

We consider in some detail the set of 3 × 3 real orthogonal matrices with a 
determinant +1 (rotations only, no reflections). This group is frequently 
labeled $O_3^+$, the + indicating that the determinant is +1. From Section 4.3 the 
rotations about the coordinate axes are 


1 Analytic, defined in Section 6.2, means having derivatives of all orders. 
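$$R_x(\varphi) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\varphi & \sin\varphi \\ 0 & -\sin\varphi & \cos\varphi \end{pmatrix}, \qquad
R_y(\theta) = \begin{pmatrix} \cos\theta & 0 & -\sin\theta \\ 0 & 1 & 0 \\ \sin\theta & 0 & \cos\theta \end{pmatrix},$$

$$R_z(\psi) = \begin{pmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{4.222}$$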


We are following the conventions of Section 4.3. The rotations are 
counterclockwise rotations of the coordinate system to a new orientation. Also, from 
Section 4.3 the general member of O₃⁺ is the Euler angle rotation 

\[
A(\alpha, \beta, \gamma) \equiv R = R_z(\gamma)\, R_y(\beta)\, R_z(\alpha).
\tag{4.223}
\]

The relation of the O₃⁺ group and orbital angular momentum is developed in 
Section 4.11. O₃⁺ also appears in Section 4.12 leading into SU(3) and particle 
physics. 
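The Euler-angle product of Eq. 4.223 can be checked numerically. The following is a minimal sketch in plain Python, not part of the original text; the helper names (`rot_x`, `euler`, `det3`, and so on) and the test angles are illustrative assumptions. It builds the three axis rotations of Eq. 4.222 and confirms that their product is orthogonal with determinant +1.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def rot_x(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[1, 0, 0], [0, c, s], [0, -s, c]]

def rot_y(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0, -s], [0, 1, 0], [s, 0, c]]

def rot_z(psi):
    c, s = math.cos(psi), math.sin(psi)
    return [[c, s, 0], [-s, c, 0], [0, 0, 1]]

def euler(alpha, beta, gamma):
    """A(alpha, beta, gamma) = R_z(gamma) R_y(beta) R_z(alpha), Eq. 4.223."""
    return matmul(rot_z(gamma), matmul(rot_y(beta), rot_z(alpha)))

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

A = euler(0.3, 1.1, -0.7)            # arbitrary test angles (radians)
AAt = matmul(A, transpose(A))        # should be the unit matrix
orthogonality_error = max(abs(AAt[i][j] - (i == j))
                          for i in range(3) for j in range(3))
determinant = det3(A)
print(orthogonality_error, determinant)
```

Both printed numbers should agree with 0 and 1, respectively, to rounding error, which is what membership in O₃⁺ requires.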


Special Unitary Group, SU(2) 

The set of n × n unitary matrices also forms a group. (Again, check to see 
that the group properties are satisfied.) This group is often labeled U(n). We 
impose the additional restriction that the determinant of the matrices be +1 and 
obtain the special unitary or unitary unimodular group, SU(n). Our n × n 
unitary, unit-determinant matrix has n² − 1 independent parameters. For n = 2 
there are three parameters, the same as for O₃⁺. For n = 3 there are eight 
parameters. This will become the eightfold way of Section 4.12. 

For n = 2 we have SU(2) with a general group element 

\[
U = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix}
\tag{4.224}
\]

with a*a + b*b = 1. As indicated, a and b are complex. These parameters are 
often called the Cayley-Klein parameters, having been introduced by Cayley 
and Klein in connection with problems of rotation in mechanics. Although not 
quite so obvious, an alternate general form is 

\[
U(\xi, \eta, \zeta) = \begin{pmatrix} e^{i\xi}\cos\eta & e^{i\zeta}\sin\eta \\ -e^{-i\zeta}\sin\eta & e^{-i\xi}\cos\eta \end{pmatrix}
\tag{4.225}
\]

with the three parameters ξ, η, and ζ real. Both these forms, Eqs. 4.224 and 
4.225, may be checked by showing that UU† = 1. 
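The check UU† = 1 suggested for Eqs. 4.224 and 4.225 is easy to carry out numerically. This is a hedged sketch rather than anything from the original text: the function names and parameter values are assumptions chosen for illustration. It treats the three-angle form as the Cayley-Klein form with a = e^{iξ} cos η and b = e^{iζ} sin η.

```python
import cmath
import math

def su2_cayley_klein(a, b):
    """Eq. 4.224; the caller must supply |a|^2 + |b|^2 = 1."""
    return [[a, b], [-b.conjugate(), a.conjugate()]]

def su2_angles(xi, eta, zeta):
    """Eq. 4.225 with three real parameters."""
    a = cmath.exp(1j * xi) * math.cos(eta)
    b = cmath.exp(1j * zeta) * math.sin(eta)
    return su2_cayley_klein(a, b)

def unitarity_error(U):
    """Largest entry of |U U^dagger - 1|."""
    err = 0.0
    for i in range(2):
        for j in range(2):
            entry = sum(U[i][k] * U[j][k].conjugate() for k in range(2))
            err = max(err, abs(entry - (i == j)))
    return err

def det2(U):
    return U[0][0] * U[1][1] - U[0][1] * U[1][0]

U = su2_angles(0.4, 1.2, -0.9)       # arbitrary real parameters
print(unitarity_error(U), det2(U))
```

The unitarity error should vanish to rounding and the determinant should equal 1, confirming that the matrix is both unitary and unimodular.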

Now let us determine the irreducible representations of SU(2). Returning to 
Eq. 4.224, we see that U describes a transformation of a two-component 
complex column vector (called a spinor): 

\[
\begin{pmatrix} u' \\ v' \end{pmatrix} = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix} \begin{pmatrix} u \\ v \end{pmatrix}
\tag{4.226}
\]

or 

\[
u' = au + bv, \qquad v' = -b^*u + a^*v.
\tag{4.227}
\]

From the form of this result, if we were to start with a homogeneous polynomial 
of the nth degree in u and v and carry out the unitary transformation, Eq. 4.227, 
we would still have a homogeneous nth-degree polynomial. This is significant 
in that the n + 1 terms uⁿ, uⁿ⁻¹v, uⁿ⁻²v², and so on belong to an 
(n + 1)-dimensional representation of our special unitary group. 

To save algebraic juggling, we follow the choice of Wigner and let n = 2j and 
consider the (monomial) function 

\[
f_m(u, v) = \frac{u^{j+m}\, v^{j-m}}{\sqrt{(j+m)!\,(j-m)!}}.
\tag{4.228}
\]


The index m will range from −j to +j, covering all terms of the form u^p v^q with 
p + q = 2j. The denominator is a sort of normalizing factor that will make our 
representation unitary. If we take the action of U on f_m(u, v) to be² 

\[
U f_m(u, v) = f_m(u', v'),
\tag{4.229}
\]

then 

\[
U f_m(u, v) = f_m(au + bv,\, -b^*u + a^*v)
= \frac{(au + bv)^{j+m}\,(-b^*u + a^*v)^{j-m}}{\sqrt{(j+m)!\,(j-m)!}}.
\tag{4.230}
\]


Now the job is to express the right-hand side of Eq. 4.230 as a linear 
combination of terms of the form of f_m(u, v). The coefficients in the linear 
combination will give us the desired representation. We expand the two binomials 
by the binomial theorem (Section 5.6), obtaining 

\[
(au + bv)^{j+m} = \sum_{k=0}^{j+m} \frac{(j+m)!}{k!\,(j+m-k)!}\, a^{j+m-k}\, u^{j+m-k}\, b^k\, v^k,
\]
\[
(-b^*u + a^*v)^{j-m} = \sum_{l=0}^{j-m} \frac{(j-m)!}{l!\,(j-m-l)!}\, (-b^*)^{j-m-l}\, u^{j-m-l}\, a^{*l}\, v^l.
\tag{4.231}
\]


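Then 

\[
U f_m(u, v) = \sum_{k=0}^{j+m} \sum_{l=0}^{j-m}
\frac{\sqrt{(j+m)!\,(j-m)!}}{k!\,l!\,(j+m-k)!\,(j-m-l)!}\;
a^{j+m-k}\, a^{*l}\, b^k\, (-b^*)^{j-m-l}\, u^{2j-k-l}\, v^{k+l}.
\tag{4.232}
\]

If we let j − k − l = m′, 

\[
u^{2j-k-l}\, v^{k+l} = u^{j+m'}\, v^{j-m'},
\tag{4.233}
\]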


matching the form of Eq. 4.228. Replacing the summation over l by a 
summation over m′, 

\[
U f_m(u, v) = \sum_{m'=-j}^{j} U_{mm'}\, f_{m'}(u, v),
\tag{4.234}
\]

where the matrix element U_{mm′} is given by 

\[
U_{mm'} = \sum_{k=0}^{j+m}
\frac{\sqrt{(j+m)!\,(j-m)!\,(j+m')!\,(j-m')!}}{k!\,(j-m'-k)!\,(j+m-k)!\,(m'-m+k)!}\;
a^{j+m-k}\, a^{*(j-m'-k)}\, b^k\, (-b^*)^{m'-m+k}.
\tag{4.235}
\]

The index k starts with zero and runs up to j + m, but the factorials³ in the 
denominator guarantee that the coefficient will vanish if any exponent goes 
negative. 

² In Section 4.11 the transformation (rotation) of a function is defined in 
terms of the inverse rotation of the coordinates. Here we use Eq. 4.229 since 
we are setting up a comparison with O₃⁺, which is described in terms of 
rotations of the coordinates (Eq. 4.222). 

Equation 4.234 shows that the effect of U operating on f_m is given by a linear 
combination of the f_{m′} with coefficients U_{mm′}. This is the same as the rotation 
operator discussed at the beginning of Section 4.2. The rotation operator was 
represented by the matrix A. Here the operator U is represented by the matrix 
of elements U_{mm′}. Since m and m′ each range from −j to +j in unit steps, our 
matrices (U_{mm′}) representing SU(2) have dimensions (2j + 1) × (2j + 1). 

To be a little more specific about this: if j = ½, 

\[
U = \begin{array}{c|cc}
 & m' = \tfrac12 & m' = -\tfrac12 \\ \hline
m = \tfrac12 & a & b \\
m = -\tfrac12 & -b^* & a^*
\end{array}
\tag{4.236}
\]

identical with Eq. 4.224. The cases for j = 1 and up are most conveniently 
handled with trigonometric functions, as shown subsequently. 
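The sum for U_{mm′} in Eq. 4.235 translates directly into a short routine. The sketch below is an illustration under stated assumptions, not the author's code: the function names are hypothetical, the values of a and b are arbitrary, and the arguments are passed as 2j, 2m, 2m′ so that half-integral labels stay exact integers. Terms with a negative factorial argument are skipped, which is the numerical counterpart of the vanishing coefficients mentioned above.

```python
import cmath
import math
from math import factorial

def u_element(two_j, two_m, two_mp, a, b):
    """U_{mm'} of Eq. 4.235, with two_j = 2j, two_m = 2m, two_mp = 2m'."""
    jpm = (two_j + two_m) // 2            # j + m
    jmm = (two_j - two_m) // 2            # j - m
    jpmp = (two_j + two_mp) // 2          # j + m'
    jmmp = (two_j - two_mp) // 2          # j - m'
    norm = math.sqrt(factorial(jpm) * factorial(jmm)
                     * factorial(jpmp) * factorial(jmmp))
    total = 0j
    for k in range(jpm + 1):
        p_a = jpm - k                     # exponent j + m - k of a
        p_astar = jmmp - k                # exponent j - m' - k of a*
        p_bstar = jpmp - jpm + k          # exponent m' - m + k of (-b*)
        if p_astar < 0 or p_bstar < 0:
            continue                      # 1/(negative integer)! vanishes
        coeff = norm / (factorial(k) * factorial(p_astar)
                        * factorial(p_a) * factorial(p_bstar))
        total += (coeff * a**p_a * a.conjugate()**p_astar
                  * b**k * (-b.conjugate())**p_bstar)
    return total

def representation(two_j, a, b):
    """(2j+1) x (2j+1) matrix; rows m and columns m' run from +j down to -j."""
    labels = range(two_j, -two_j - 1, -2)
    return [[u_element(two_j, tm, tmp, a, b) for tmp in labels] for tm in labels]

def unitarity_error(U):
    n = len(U)
    return max(abs(sum(U[i][k] * U[j][k].conjugate() for k in range(n)) - (i == j))
               for i in range(n) for j in range(n))

a = cmath.exp(0.5j) * math.cos(0.7)       # arbitrary values with |a|^2 + |b|^2 = 1
b = cmath.exp(-1.1j) * math.sin(0.7)
U_half = representation(1, a, b)          # j = 1/2, should reproduce Eq. 4.224
U_one = representation(2, a, b)           # j = 1, a 3 x 3 matrix
print(U_half, unitarity_error(U_one))
```

For j = ½ the result should be the matrix of Eq. 4.224, and for j = 1 the 3 × 3 matrix should be unitary, as the normalizing factor in Eq. 4.228 is meant to guarantee.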


SU(2)-O₃⁺ Homomorphism 

As just seen, the elements of SU(2) describe rotations in a two-dimensional 
complex space. (The invariance of s†s, Exercise 4.10.6, suggests a “rotation” 
of the spinor s, Eq. 4.226.) The determinant is +1. There are three independent 
parameters. Our real orthogonal group O₃⁺, determinant +1, clearly describes 
rotations in ordinary three-dimensional space with the important characteristic 
of leaving x² + y² + z² invariant. Also, there are three independent parameters. 
The rotation interpretations and the equality of numbers of parameters suggest 
the existence of some sort of correspondence between the groups SU(2) and 
O₃⁺. Here we develop this correspondence. 

The operation of SU(2) on a matrix is given by a unitary transformation, 
Eq. 4.122, 

\[
M' = U M U^\dagger.
\tag{4.237}
\]

³ From Section 10.1, (−n)! = ±∞ for n = 1, 2, 3, .... 

Taking M to be a 2 × 2 matrix, we note that any 2 × 2 matrix may be written 
as a linear combination of the unit matrix and the three Pauli matrices of 
Section 4.2. Let M be the zero-trace matrix, 

\[
M = x\sigma_1 + y\sigma_2 + z\sigma_3 = \begin{pmatrix} z & x - iy \\ x + iy & -z \end{pmatrix},
\tag{4.238}
\]

the unit matrix not entering. Since the trace is invariant under a unitary 
transformation (Exercise 4.3.9), M′ must have the same form, 

\[
M' = x'\sigma_1 + y'\sigma_2 + z'\sigma_3 = \begin{pmatrix} z' & x' - iy' \\ x' + iy' & -z' \end{pmatrix}.
\tag{4.239}
\]

The determinant is also invariant under a unitary transformation (Exercise 
4.3.10). Therefore 

\[
-(x^2 + y^2 + z^2) = -(x'^2 + y'^2 + z'^2),
\tag{4.240}
\]

or x² + y² + z² is invariant under this operation of SU(2), just as with O₃⁺. 
SU(2) must, therefore, describe a rotation. This suggests that SU(2) and O₃⁺ 
may be isomorphic or homomorphic. 

We approach the problem of what rotation SU(2) describes by considering 
special cases. Returning to Eq. 4.224 with one eye on Eq. 4.225, let a = e^{iξ} 
and b = 0, or 

\[
U_z = \begin{pmatrix} e^{i\xi} & 0 \\ 0 & e^{-i\xi} \end{pmatrix}.
\tag{4.241}
\]

In anticipation of Eq. 4.245, this U is given a subscript z. 

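Carrying out a unitary transformation on each of the three Pauli σ's, we 
have 

\[
U_z \sigma_1 U_z^\dagger
= \begin{pmatrix} e^{i\xi} & 0 \\ 0 & e^{-i\xi} \end{pmatrix}
  \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
  \begin{pmatrix} e^{-i\xi} & 0 \\ 0 & e^{i\xi} \end{pmatrix}
= \begin{pmatrix} 0 & e^{2i\xi} \\ e^{-2i\xi} & 0 \end{pmatrix}.
\tag{4.242}
\]

We reexpress this result in terms of the Pauli σ's to obtain 

\[
U_z\, x\sigma_1\, U_z^\dagger = x\cos 2\xi\,\sigma_1 - x\sin 2\xi\,\sigma_2.
\tag{4.243}
\]

Similarly, 

\[
U_z\, y\sigma_2\, U_z^\dagger = y\sin 2\xi\,\sigma_1 + y\cos 2\xi\,\sigma_2, \qquad
U_z\, z\sigma_3\, U_z^\dagger = z\sigma_3.
\tag{4.244}
\]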


From these double angle expressions we see that we should start with a 
half-angle: ξ = α/2. Then, from Eqs. 4.237-4.239, 4.243, and 4.244 

\[
x' = x\cos\alpha + y\sin\alpha, \qquad
y' = -x\sin\alpha + y\cos\alpha, \qquad
z' = z.
\tag{4.245}
\]


The 2 × 2 unitary transformation using U_z(α/2) is equivalent to the rotation 
operator R_z(α) of Eq. 4.222. 
The establishment of the correspondence of 

\[
U_y(\beta/2) = \begin{pmatrix} \cos\beta/2 & \sin\beta/2 \\ -\sin\beta/2 & \cos\beta/2 \end{pmatrix}
\tag{4.246}
\]

and R_y(β) and of 

\[
U_x(\varphi/2) = \begin{pmatrix} \cos\varphi/2 & i\sin\varphi/2 \\ i\sin\varphi/2 & \cos\varphi/2 \end{pmatrix}
\tag{4.247}
\]

and R_x(φ) is left as Exercise 4.10.7. The reader might note that U_k(ψ/2) has 
the general form 

\[
U_k(\psi/2) = 1\cos\psi/2 + i\sigma_k \sin\psi/2,
\tag{4.248}
\]

where k = x, y, z. We return to this point in Section 4.11. 
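The equivalence of the unitary transformation with U_z(α/2) and the rotation R_z(α) can be confirmed on a sample point. The sketch below is illustrative only: the helper names and the numerical values of α, x, y, z are assumptions. It forms M from (x, y, z), applies M′ = UMU†, and reads x′, y′, z′ off the transformed matrix.

```python
import cmath
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    """Hermitian conjugate (conjugate transpose)."""
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def rotate_with_uz(alpha, x, y, z):
    """Apply M' = U M U^dagger with U = U_z(alpha/2) and return (x', y', z')."""
    U = [[cmath.exp(0.5j * alpha), 0], [0, cmath.exp(-0.5j * alpha)]]
    M = [[z, x - 1j * y], [x + 1j * y, -z]]      # zero-trace matrix x s1 + y s2 + z s3
    Mp = matmul(U, matmul(M, dagger(U)))
    # The lower-left entry is x' + i y'; the upper-left entry is z'.
    return Mp[1][0].real, Mp[1][0].imag, Mp[0][0].real

alpha, x, y, z = 0.9, 1.0, -2.0, 0.5             # arbitrary sample values
xp, yp, zp = rotate_with_uz(alpha, x, y, z)
expected = (x * math.cos(alpha) + y * math.sin(alpha),
            -x * math.sin(alpha) + y * math.cos(alpha),
            z)                                   # R_z(alpha) applied to (x, y, z)
print((xp, yp, zp), expected)
```

The two printed triples should agree to rounding error, and x′² + y′² + z′² equals x² + y² + z², as the determinant argument requires.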
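The correspondence 

\[
U_z(\alpha/2) = \begin{pmatrix} e^{i\alpha/2} & 0 \\ 0 & e^{-i\alpha/2} \end{pmatrix}
\;\leftrightarrow\;
\begin{pmatrix} \cos\alpha & \sin\alpha & 0 \\ -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix} = R_z(\alpha)
\tag{4.249}
\]

is not a simple one-to-one correspondence. Specifically, as α in R_z ranges from 
0 to 2π, the parameter in U_z, α/2, goes from 0 to π. We find 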


\[
R_z(\alpha + 2\pi) = R_z(\alpha), \qquad
U_z(\alpha/2 + \pi) = \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} U_z(\alpha/2).
\tag{4.250}
\]

Therefore both U_z(α/2) and U_z(α/2 + π) = −U_z(α/2) correspond to R_z(α). The 
correspondence is 2 to 1, or SU(2) and O₃⁺ are homomorphic. This establishment 
of the correspondence between the representations of SU(2) and those of O₃⁺ 
means that the known representations of SU(2) automatically provide us with 
the representations⁴ of O₃⁺. 
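The two-to-one character of the correspondence can be seen by mapping a 2 × 2 unitary matrix to its 3 × 3 rotation. The sketch below uses the trace identity R_ij = ½ Tr(σ_i U σ_j U†), which follows from M′ = UMU† together with Tr(σ_i σ_j) = 2δ_ij; this identity and the helper names are additions for illustration and do not appear in the text.

```python
import cmath
import math

SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def rotation_from_su2(U):
    """R_ij = (1/2) Tr(sigma_i U sigma_j U^dagger), so that x'_i = R_ij x_j."""
    Ud = dagger(U)
    R = []
    for si in SIGMA:
        row = []
        for sj in SIGMA:
            P = matmul(si, matmul(U, matmul(sj, Ud)))
            row.append(0.5 * (P[0][0] + P[1][1]).real)
        R.append(row)
    return R

def u_z(half_angle):
    return [[cmath.exp(1j * half_angle), 0], [0, cmath.exp(-1j * half_angle)]]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(3) for j in range(3))

alpha = 0.6                                              # arbitrary angle
R_plus = rotation_from_su2(u_z(alpha / 2))
R_minus = rotation_from_su2(u_z(alpha / 2 + math.pi))    # this matrix is -U_z(alpha/2)
c, s = math.cos(alpha), math.sin(alpha)
R_z = [[c, s, 0], [-s, c, 0], [0, 0, 1]]
print(max_diff(R_plus, R_z), max_diff(R_plus, R_minus))
```

Both differences should vanish to rounding: U_z(α/2) and −U_z(α/2) produce the same R_z(α), because the overall sign cancels in UMU†.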

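Combining the various rotations, we find that a unitary transformation 
using 

\[
U(\alpha, \beta, \gamma) = U_z(\gamma/2)\, U_y(\beta/2)\, U_z(\alpha/2)
\tag{4.251}
\]

corresponds to the general Euler rotation R_z(γ) R_y(β) R_z(α). By direct 
multiplication, 

\[
U(\alpha, \beta, \gamma)
= \begin{pmatrix} e^{i\gamma/2} & 0 \\ 0 & e^{-i\gamma/2} \end{pmatrix}
  \begin{pmatrix} \cos\beta/2 & \sin\beta/2 \\ -\sin\beta/2 & \cos\beta/2 \end{pmatrix}
  \begin{pmatrix} e^{i\alpha/2} & 0 \\ 0 & e^{-i\alpha/2} \end{pmatrix}
= \begin{pmatrix} e^{i(\gamma+\alpha)/2}\cos\beta/2 & e^{i(\gamma-\alpha)/2}\sin\beta/2 \\
                  -e^{-i(\gamma-\alpha)/2}\sin\beta/2 & e^{-i(\gamma+\alpha)/2}\cos\beta/2 \end{pmatrix}.
\tag{4.252}
\]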
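The direct multiplication leading to Eq. 4.252 can be repeated numerically. This is a small sketch with assumed helper names and arbitrary angles; it multiplies U_z(γ/2) U_y(β/2) U_z(α/2) and compares the product with the closed form whose entries are e^{±i(γ±α)/2} times cos β/2 or sin β/2.

```python
import cmath
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def u_z(angle):
    """U_z(angle/2) = diag(e^{i angle/2}, e^{-i angle/2})."""
    return [[cmath.exp(0.5j * angle), 0], [0, cmath.exp(-0.5j * angle)]]

def u_y(angle):
    """U_y(angle/2) = [[cos, sin], [-sin, cos]] of the half-angle."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return [[c, s], [-s, c]]

def u_product(alpha, beta, gamma):
    return matmul(u_z(gamma), matmul(u_y(beta), u_z(alpha)))

def u_closed(alpha, beta, gamma):
    c, s = math.cos(beta / 2), math.sin(beta / 2)
    return [[cmath.exp(0.5j * (gamma + alpha)) * c, cmath.exp(0.5j * (gamma - alpha)) * s],
            [-cmath.exp(-0.5j * (gamma - alpha)) * s, cmath.exp(-0.5j * (gamma + alpha)) * c]]

angles = (0.3, 1.1, -0.7)                 # arbitrary (alpha, beta, gamma)
P, C = u_product(*angles), u_closed(*angles)
max_difference = max(abs(P[i][j] - C[i][j]) for i in range(2) for j in range(2))
print(max_difference)
```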


⁴ Whereas SU(2) has representations for integral and half odd integral values 
of j (j = 0, ½, 1, 3/2, ...), O₃⁺ is limited to integral values of j (j = 0, 1, 2, ...). 
Further discussion of this point, the relation between O₃⁺ and orbital angular 
momentum, appears in Sections 4.11 and 12.7. 

This is our alternate general form, Eq. 4.225, with 

\[
\xi = (\gamma + \alpha)/2, \qquad \eta = \beta/2, \qquad \zeta = (\gamma - \alpha)/2.
\tag{4.253}
\]

From Eq. 4.252 we may identify the parameters of Eq. 4.224 as 

\[
a = e^{i(\gamma+\alpha)/2}\cos\beta/2, \qquad
b = e^{i(\gamma-\alpha)/2}\sin\beta/2.
\tag{4.254}
\]

With these, our SU(2) representation U_{mm′} of Eq. 4.235 becomes 

\[
U_{mm'}(\alpha, \beta, \gamma) = \sum_{k=0}^{j+m} (-1)^k
\frac{\sqrt{(j+m)!\,(j-m)!\,(j+m')!\,(j-m')!}}{k!\,(j-m'-k)!\,(j+m-k)!\,(m'-m+k)!}\;
e^{im\gamma} \left(\cos\frac{\beta}{2}\right)^{2j+m-m'-2k}
\left(-\sin\frac{\beta}{2}\right)^{m'-m+2k} e^{im'\alpha}.
\tag{4.255}
\]


Here are our irreducible representations in terms of the Euler angles. The 
importance of Eq. 4.255 is that it allows us to calculate the (2j + 1) × (2j + 1) 
irreducible representations of SU(2) for all j (j = 0, ½, 1, 3/2, ...) and the 
irreducible representations of O₃⁺ for integral orbital angular momentum j 
(j = 0, 1, 2, ...). 
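As a quick illustration of how Eq. 4.255 is used (a worked example added here, not part of the original derivation), take j = 1, m = 0, m′ = 1. The sum runs over terms containing e^{imγ}, a power 2j + m − m′ − 2k of cos β/2, a power m′ − m + 2k of (−sin β/2), and e^{im′α}. The factor 1/(j − m′ − k)! = 1/(−k)! removes every term except k = 0, leaving

```latex
\[
U_{0,1}(\alpha,\beta,\gamma)
= \frac{\sqrt{1!\,1!\,2!\,0!}}{0!\,0!\,1!\,1!}\,
  \cos\frac{\beta}{2}\left(-\sin\frac{\beta}{2}\right) e^{i\alpha}
= -\sqrt{2}\,\sin\frac{\beta}{2}\cos\frac{\beta}{2}\; e^{i\alpha}
= -\frac{\sin\beta}{\sqrt{2}}\; e^{i\alpha}.
\]
```

The same result follows from the Cayley-Klein form, where this element is −√2 a b* with a = e^{i(γ+α)/2} cos β/2 and b = e^{i(γ−α)/2} sin β/2.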

Rotation Matrix D^j(α, β, γ) 

In the quantum mechanics literature it is customary to take the adjoint of 
U_{mm′}, defining⁵ 

\[
D^j_{m'm}(\alpha, \beta, \gamma) = U^*_{mm'}(\alpha, \beta, \gamma).
\tag{4.256}
\]

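For j = 0 

\[
D^0(\alpha, \beta, \gamma) = 1.
\tag{4.257}
\]

For j = ½ 

\[
D^{1/2}(\alpha, \beta, \gamma) =
\begin{array}{c|cc}
 & m = \tfrac12 & m = -\tfrac12 \\ \hline
m' = \tfrac12 & e^{-i\alpha/2}\cos(\beta/2)\,e^{-i\gamma/2} & -e^{-i\alpha/2}\sin(\beta/2)\,e^{i\gamma/2} \\
m' = -\tfrac12 & e^{i\alpha/2}\sin(\beta/2)\,e^{-i\gamma/2} & e^{i\alpha/2}\cos(\beta/2)\,e^{i\gamma/2}
\end{array}
\tag{4.258}
\]

For j = 1 Eqs. 4.255 and 4.256 lead to 

\[
D^1(\alpha, \beta, \gamma) =
\begin{array}{c|ccc}
 & m = 1 & m = 0 & m = -1 \\ \hline
m' = 1 & e^{-i\alpha}\,\dfrac{1+\cos\beta}{2}\,e^{-i\gamma} & -e^{-i\alpha}\,\dfrac{\sin\beta}{\sqrt2} & e^{-i\alpha}\,\dfrac{1-\cos\beta}{2}\,e^{i\gamma} \\
m' = 0 & \dfrac{\sin\beta}{\sqrt2}\,e^{-i\gamma} & \cos\beta & -\dfrac{\sin\beta}{\sqrt2}\,e^{i\gamma} \\
m' = -1 & e^{i\alpha}\,\dfrac{1-\cos\beta}{2}\,e^{-i\gamma} & e^{i\alpha}\,\dfrac{\sin\beta}{\sqrt2} & e^{i\alpha}\,\dfrac{1+\cos\beta}{2}\,e^{i\gamma}
\end{array}
\tag{4.259}
\]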


⁵ The reason for this is that U_{mm′} is defined here in terms of rotations of 
coordinates. D^j_{m′m} is used to rotate functions. Further discussion of this point 
appears in Section 4.11. 

[Figure; caption partly lost: "... angle rotations (γ = 0)"] 

For j = l, integral, the operation of the rotation matrix D^l on the spherical 
harmonics (Section 12.6) is given by⁶ 

\[
Y_l^m(\theta', \varphi') = \sum_{m'=-l}^{l} D^l_{m'm}(\alpha, \beta, \gamma)\, Y_l^{m'}(\theta, \varphi).
\tag{4.260}
\]

The point (θ′, φ′) is the same point in space as (θ, φ) but measured relative to 
the rotated coordinate system rather than relative to the initial system. This 
rotated system is specified by the three Euler angles: α, β, and γ. The rotation 
matrix D^l(α, β, γ) rotates the Y_l^m(θ, φ) the way A(α, β, γ), Eq. 4.87, rotates the 
coordinates. The first two Euler angles α and β define a new polar axis, z″ in 
Fig. 4.13, and a new zero of azimuth. (The third Euler angle γ corresponds to a 
rotation about the new polar axis and is irrelevant here.) Eq. 4.260 has a wide 
variety of applications, ranging from the angular correlation of nuclear 
radiations to the relation between the body-fixed axes of a rotating solid and 
the space-fixed axes. 

Note the analogy with the homogeneous functions f_m(u, v) of Eq. 4.228. The 
spherical harmonics Y_l^m(θ, φ) expressed in cartesian coordinates are 
homogeneous functions of x, y, and z. (Each term of r^l Y_l^m has the form x^a y^b z^c with 
a + b + c = l.) Thus Eq. 4.260 is the analog of Eq. 4.229. 

One immediate application of the rotation matrix D^j is in the proof of the 
spherical harmonic addition theorem, Exercise 4.10.11. For further details of 
D^j the reader should consult the text by Rose, cited in the references at the end 
of this chapter. 


⁶ The proof of this equation hinges on the identification of D^l_{m′m} as a matrix 
element of the rotation operator (exp(−iφ·L), Section 4.12) with the 
spherical harmonics taken as the basis functions. 

EXERCISES 

4.10.1 Show that an n × n orthogonal matrix has n(n − 1)/2 independent parameters. 
Hint. The orthogonality condition, Eq. 4.60, provides constraints. 

4.10.2 Show that an n × n special unitary matrix has n² − 1 independent parameters. 
Hint. Each element may be complex, doubling the number of possible 
parameters. Some of the constraining equations are likewise complex and count 
as two constraints. 

4.10.3 The special linear group SL(2) consists of all 2 × 2 matrices (with complex 
elements) having a determinant of +1. Show that such matrices form a group. 
Note. The SL(2) group can be related to the full Lorentz group, Section 4.13, 
much as the SU(2) group is related to O₃⁺. 

4.10.4 Show that R_z is (or is not) an invariant subgroup of O₃⁺. 

4.10.5 Prove that the general form of a 2 × 2 unitary, unimodular matrix is 

\[
U = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix}
\]

with a*a + b*b = 1. 

4.10.6 Denoting the spinor (u, v) of Eq. 4.226 by s, show that s†s = s′†s′, that is, 
the length of the spinor is conserved under the transformation U. 

4.10.7 (a) Show that U_y(β/2) corresponds to R_y(β). 

(b) Show that U_x(φ/2) corresponds to R_x(φ). 

4.10.8 (a) Show that the α and γ dependence of D^j(α, β, γ) may be factored out such 
that 

\[
D^j(\alpha, \beta, \gamma) = A^j(\alpha)\, d^j(\beta)\, C^j(\gamma).
\]

(b) Show that A^j(α) and C^j(γ) are diagonal. Find the explicit forms. 

(c) Show that d^j(β) = D^j(0, β, 0). 

Hint. Exercises 4.2.28 and 4.2.29 may be helpful. 

4.10.9 By inspection of Eqs. 4.255 and 4.256, or the special cases, Eqs. 4.258 and 4.259, 

\[
D^{j\dagger}(\alpha, \beta, \gamma) = D^j(-\gamma, -\beta, -\alpha).
\]

Explain why this should be so. 

4.10.10 For l = 1 Eq. 4.260 becomes 

\[
Y_1^m(\theta', \varphi') = \sum_{m'=-1}^{1} D^1_{m'm}(\alpha, \beta, \gamma)\, Y_1^{m'}(\theta, \varphi).
\]

Rewrite these spherical harmonics in cartesian form. Using D¹ from Eq. 4.259, 
show that the resulting cartesian coordinate equations are equivalent to the 
Euler rotation matrix A(α, β, γ), Eq. 4.80, rotating the coordinates. 

4.10.11 (a) Assuming that D^j(α, β, γ) is unitary, show that 

\[
\sum_{m=-l}^{l} Y_l^{m*}(\theta_1, \varphi_1)\, Y_l^m(\theta_2, \varphi_2)
\]

is a scalar quantity (invariant under rotations). This is a function analog 
of a scalar product of vectors. 

(b) From part (a) derive the spherical harmonic addition theorem, Eq. 12.224: 

\[
P_l(\cos\gamma) = \frac{4\pi}{2l+1} \sum_{m=-l}^{l} Y_l^{m*}(\theta_1, \varphi_1)\, Y_l^m(\theta_2, \varphi_2).
\]

Hint. Set θ₁ = 0 (which makes θ₂ = γ) and quote Exercise 12.6.2. 


4.11 GENERATORS 

Rotations and Angular Momentum 

From Section 4.3 we have matrix representations of the rotation of a 
coordinate system and the rotation of a vector. From Section 4.10 we have matrix 
representations of the rotation of functions. In all these cases rotations about 
a common axis combine as 

\[
R_z(\varphi_2)\, R_z(\varphi_1) = R_z(\varphi_1 + \varphi_2).
\]

Multiplication of these matrices is equivalent to addition of the arguments. 
This suggests that we look for an exponential representation of our rotations: 

\[
\exp(\varphi_1)\cdot\exp(\varphi_2) = \exp(\varphi_1 + \varphi_2).
\]

From Exercise 4.5.12 we take two matrices U and H related by 

\[
U = e^{iaH} = 1 + iaH + (iaH)^2/2! + \cdots.
\tag{4.261}
\]

Here a is a real parameter independent of H. The Maclaurin expansion of the 
exponential serves to define the exponential. Further, from Exercise 4.5.12, 
if H is Hermitian, then U is unitary. Similarly, if U is unitary, H is Hermitian. 

Now, in the context of group theory, H is labeled a generator,¹ the generator 
of U. The relation of the generator to the rotation group O₃⁺ is indicated 
schematically in Fig. 4.14. 

1. Starting with the left side of Fig. 4.14, the matrix describing a finite rotation 
of the coordinates through an angle φ counterclockwise about the z-axis is given 
by Eq. 4.44 as 

\[
R_z(\varphi) = \begin{pmatrix} \cos\varphi & \sin\varphi & 0 \\ -\sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\tag{4.262}
\]

2. Let the rotation described by R_z be an infinitesimal rotation through an 
angle δφ. Then R_z may be written as 

\[
R_z(\delta\varphi) = 1 + i\,\delta\varphi\, M_z,
\tag{4.263}
\]

where 

\[
M_z = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
\tag{4.264}
\]


¹ The use of the term generator here for continuous groups is completely 
different from the use of this term for finite groups (compare Exercise 4.9.7). 

FIG. 4.14 Group-generator relationships 

M_z and the corresponding matrices M_x and M_y appear in Exercise 4.2.16 where 
they are shown to satisfy particular commutation relations (as in Exercise 1.8.8). 
In Section 12.7 we will show that this identifies the M matrices as an angular 
momentum representation. M_z may also be obtained by differentiation. If we 
interpret the derivative of a matrix as the matrix of the derivatives, then 

\[
\left. \frac{dR_z}{d\varphi} \right|_{\varphi = 0} = iM_z.
\tag{4.265}
\]

From this point of view, Eq. 4.263 is a Maclaurin expansion of R_z with terms 
of order (δφ)² and beyond omitted. The validity of Eq. 4.265 is a consequence 
of the differentiability of Lie groups. 

3. Our finite rotation φ may be compounded of successive infinitesimal 
rotations δφ. 

\[
R_z(\delta\varphi_2 + \delta\varphi_1) = (1 + i\,\delta\varphi_2 M_z)(1 + i\,\delta\varphi_1 M_z).
\tag{4.266}
\]

Let δφ = φ/N for N rotations, with N → ∞. Then, 

\[
R_z(\varphi) = \lim_{N\to\infty} \left[1 + (i\varphi/N)\, M_z\right]^N = \exp(i\varphi M_z).
\tag{4.267}
\]

From this form we identify M_z as the generator of the group R_z(φ), a subgroup 
of O₃⁺. The actual reconstruction of R_z(φ) appears subsequently. Two 
characteristics are worth noting: 

a. M_z is Hermitian and R_z(φ) is unitary. 

b. Trace(M_z) = 0 and det R_z(φ) = +1. 

In direct analogy with M_z, M_x may be identified as the generator of R_x, 
the (sub)group of rotations about the x-axis, and M_y then generates R_y. 

4. As indicated in Eq. 4.261, the exponential may be expanded to give 

\[
\exp(i\varphi M_z) = 1 + i\varphi M_z + (i\varphi M_z)^2/2! + (i\varphi M_z)^3/3! + \cdots
\]
\[
= \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}
+ \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}
  \{1 - \varphi^2/2! + \varphi^4/4! - \cdots\}
+ iM_z \{\varphi - \varphi^3/3! + \varphi^5/5! - \cdots\}.
\tag{4.268}
\]

In the second preceding equality the relations 

\[
M_z^2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\quad \text{and} \quad M_z^3 = M_z
\tag{4.269}
\]

have been used. Recognizing that the first series is cos φ and the second sin φ, 
we have R_z(φ) as given in Eq. 4.262. 

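5. Returning to the infinitesimal level, our infinitesimal rotations commute 
(to first order in the δφ's): 

\[
[R_x(\delta\varphi_x), R_y(\delta\varphi_y)] = [R_y(\delta\varphi_y), R_z(\delta\varphi_z)]
= [R_z(\delta\varphi_z), R_x(\delta\varphi_x)] = 0,
\tag{4.270}
\]

and an infinitesimal rotation about an axis defined by a unit vector n becomes 

\[
R(\delta\varphi) = 1 + i(\delta\varphi_x M_x + \delta\varphi_y M_y + \delta\varphi_z M_z)
= 1 + i\,\delta\varphi\; \mathbf{n}\cdot\mathbf{M}.
\tag{4.271}
\]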


6. From Exercise 4.2.16 the generators satisfy the commutation relations 

\[
[M_i, M_j] = i\varepsilon_{ijk} M_k
\tag{4.272}
\]

characteristic of angular momentum, Exercise 1.8.8. Here ε_{ijk} is the totally 
antisymmetric Levi-Civita symbol of Section 3.4. A summation over k is 
implied, but there is only one nonvanishing term. The coefficient of M_k, iε_{ijk}, 
is called the structure constant. The structure constants form the starting point 
for the development of a Lie algebra. As previously seen, the group generators 
determine the structure constants. Conversely, it may be shown that the 
structure constants determine the group. 
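Steps 4 and 6 above lend themselves to a numerical check. The sketch below is illustrative and rests on one assumption that is not in the text: the three generators are written compactly as (M_k)_{ab} = −iε_{kab}, a form that reproduces the matrix M_z = [[0, −i, 0], [i, 0, 0], [0, 0, 0]] implied by R_z(δφ) = 1 + iδφ M_z. All function names are hypothetical. The code verifies [M_i, M_j] = iε_{ijk}M_k and sums the Maclaurin series of exp(iφM_z) to recover R_z(φ).

```python
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

# Assumed compact form of the generators; M[0], M[1], M[2] are M_x, M_y, M_z.
M = [[[-1j * levi_civita(k, a, b) for b in range(3)] for a in range(3)]
     for k in range(3)]

def commutation_error():
    """Largest entry of [M_i, M_j] - i eps_ijk M_k over all i, j."""
    worst = 0.0
    for i in range(3):
        for j in range(3):
            AB, BA = matmul(M[i], M[j]), matmul(M[j], M[i])
            for a in range(3):
                for b in range(3):
                    rhs = sum(1j * levi_civita(i, j, k) * M[k][a][b] for k in range(3))
                    worst = max(worst, abs(AB[a][b] - BA[a][b] - rhs))
    return worst

def mat_exp(A, terms=40):
    """Maclaurin series 1 + A + A^2/2! + ..., truncated after `terms` terms."""
    n = len(A)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

phi = 0.8                                   # arbitrary finite rotation angle
R_series = mat_exp([[1j * phi * entry for entry in row] for row in M[2]])
c, s = math.cos(phi), math.sin(phi)
R_exact = [[c, s, 0], [-s, c, 0], [0, 0, 1]]
series_error = max(abs(R_series[i][j] - R_exact[i][j])
                   for i in range(3) for j in range(3))
print(commutation_error(), series_error)
```

Both errors should be at the level of rounding, which illustrates that the generator M_z alone is enough to rebuild the finite rotation.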

The result of this manipulation is that 

\[
R_z(\varphi) = \exp(i\varphi M_z).
\]

R_z(φ) describes a rotation of the coordinate system about the z-axis and M_z 
is identified as an angular momentum matrix. The sign in the exponent is 
positive since we have rotated the coordinate system; rotation of a vector 
relative to a fixed coordinate system would be described by 

\[
R_z(\varphi) = \exp(-i\varphi M_z).
\]

It might be noted that Eq. 4.272 has an infinite number of solutions. The three 
matrices M_x, M_y, M_z of Exercise 4.2.15 constitute one solution, 
corresponding to one unit of angular momentum. Other solutions, (2l + 1) × (2l + 1) 
matrices, with l = 2, 3, 4, ..., generate the other irreducible representations of 
the rotation group, O₃⁺. 

Rotation of Functions 

In all the foregoing discussion the matrices rotate the coordinates. Any 
physical system being described is held fixed. Now let us hold the coordinates 
fixed and rotate a function ψ(x, y, z) relative to our fixed coordinates. With R 
to rotate the coordinates, we introduce an operator ℛ to rotate functions. 
We define ℛ by 

\[
\mathcal{R}\psi(x, y, z) = \psi'(x, y, z) = \psi(\mathbf{x}')
\tag{4.273}
\]

with 

\[
\mathbf{x}' = R\mathbf{x}.
\tag{4.274}
\]

In words, ℛ operates on the function ψ, rotating ψ and creating a new function 
ψ′. This new function ψ′ is numerically equal to ψ(x′), where x′ indicates that 
the coordinates have been rotated by R. For the special case of a rotation about 
the z-axis 

\[
\mathcal{R}_z(\varphi)\psi(x, y, z) = \psi(x\cos\varphi + y\sin\varphi,\; -x\sin\varphi + y\cos\varphi,\; z).
\tag{4.275}
\]

To get some understanding of the meaning of Eq. 4.275 consider the case 
φ = π/2. Then 

\[
\mathcal{R}_z\psi(x, y, z) = \psi(y, -x, z).
\tag{4.276}
\]



FIG. 4.15 Rotation of a function ψ(x, y, z) 

The function ψ may represent a wavefunction or some classical physical system. 

Imagine that ψ(x, y, z) is large when its first argument is large. Then 
ℛ_z(φ = π/2)ψ(x, y, z) will be large when the first argument of ψ(y, −x, z) is 
large, that is, when y is large. This is pictured in Fig. 4.15. The effect, then, of ℛ 
is to rotate the pattern of the function ψ counterclockwise, the same as R would 
rotate the coordinate system. 

Returning to Eq. 4.275, consider an infinitesimal rotation again, φ → δφ. 
Then, using R_z, Eq. 4.262, we obtain 

\[
\mathcal{R}_z(\delta\varphi)\psi(x, y, z) = \psi(x + y\,\delta\varphi,\; y - x\,\delta\varphi,\; z).
\tag{4.277}
\]

The right side may be expanded as a Taylor series (Section 5.6) to give 

\[
\mathcal{R}_z(\delta\varphi)\psi(x, y, z)
= \psi(x, y, z) - \delta\varphi\,\{x\,\partial\psi/\partial y - y\,\partial\psi/\partial x\} + O(\delta\varphi)^2
= (1 - i\,\delta\varphi\, L_z)\,\psi(x, y, z),
\tag{4.278}
\]

the differential expression in curly brackets being iL_z, Exercise 1.8.7 again. 
Since a rotation of first φ and then δφ about the z-axis is given by 

\[
\mathcal{R}_z(\varphi + \delta\varphi)\psi = \mathcal{R}_z(\delta\varphi)\,\mathcal{R}_z(\varphi)\psi
= (1 - i\,\delta\varphi\, L_z)\,\mathcal{R}_z(\varphi)\psi,
\tag{4.279}
\]

we have (as an operator equation) 

\[
\frac{\mathcal{R}_z(\varphi + \delta\varphi) - \mathcal{R}_z(\varphi)}{\delta\varphi} = -iL_z\,\mathcal{R}_z(\varphi).
\tag{4.280}
\]

The left side is just dℛ_z(φ)/dφ (for δφ → 0). In this form Eq. 4.280 integrates 
immediately to 

\[
\mathcal{R}_z(\varphi) = \exp(-i\varphi L_z).
\tag{4.281}
\]

Note carefully that ℛ_z(φ) rotates functions (counterclockwise) relative to fixed 
coordinates and that L_z is our angular momentum operator. The constant of 
integration is fixed by the boundary condition ℛ_z(0) = 1. 
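A one-line example makes the exponential form concrete. The choice ψ = x + iy is an illustration added here, not taken from the text. With iL_z equal to the bracketed expression x ∂/∂y − y ∂/∂x, this ψ is an eigenfunction of L_z with eigenvalue 1, so the exponential operator simply multiplies it by a phase, and direct substitution of the rotated arguments gives the same result:

```latex
\[
L_z (x + iy) = -i\left(x\,\frac{\partial}{\partial y} - y\,\frac{\partial}{\partial x}\right)(x + iy)
= -i\,(ix - y) = x + iy,
\qquad\text{so}\qquad
\exp(-i\varphi L_z)(x + iy) = e^{-i\varphi}(x + iy),
\]
\[
\psi(x\cos\varphi + y\sin\varphi,\; -x\sin\varphi + y\cos\varphi)
= (x\cos\varphi + y\sin\varphi) + i(-x\sin\varphi + y\cos\varphi)
= (x + iy)(\cos\varphi - i\sin\varphi) = e^{-i\varphi}(x + iy).
\]
```

The agreement of the two lines is the content of the integration from the infinitesimal to the finite rotation for this particular function.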

Note the resemblance to Eq. 4.267 and the differences. R_z rotates the 
coordinates; ℛ_z rotates functions. M_z is a matrix, L_z a differential operator. 
Note also that L_x, L_y, and L_z satisfy exactly the same commutation relation 
as M_x, M_y, and M_z, 

\[
[L_i, L_j] = i\varepsilon_{ijk} L_k
\tag{4.282}
\]

and yield the same structure constants. 

Equations 4.281 and 4.267 might also be compared with two equations in 
Section 4.3: Eq. 4.89, in which A rotates coordinates counterclockwise, and 
Eq. 4.93, in which the same A rotates a vector clockwise. Here we have R 
rotating coordinates counterclockwise and ℛ rotating functions 
counterclockwise. This is a consequence of the negative exponential in Eq. 4.281. 


SU(2) and the Pauli Matrices 

The elements (U_x, U_y, U_z) of the two-dimensional unitary group, SU(2), may 
be generated by 

\[
\exp(\tfrac12 ia\sigma_1), \quad \exp(\tfrac12 ib\sigma_2), \quad \text{and} \quad \exp(\tfrac12 ic\sigma_3),
\tag{4.283}
\]

where σ₁, σ₂, and σ₃ are the three Pauli spin matrices. The three parameters 
a, b, and c are real. Again, note that the σ's are Hermitian and have zero trace. 
The elements of SU(2), Eq. 4.283, are unitary and have a determinant of +1. 
It might be noted that the generators in diagonal form such as σ₃ lead to 
conserved quantum numbers. 

The Pauli σ's satisfy commutation relations 

\[
[\sigma_i, \sigma_j] = 2i\varepsilon_{ijk}\sigma_k.
\tag{4.284}
\]

This differs from the L and M commutation relations, Eqs. 4.272 and 4.282, 
by a factor of 2. Let us therefore define s_i = ½σ_i, i = 1, 2, 3. Then 

\[
[s_i, s_j] = i\varepsilon_{ijk} s_k
\tag{4.285}
\]

exactly like the angular momentum commutation relations,² Eqs. 4.272 and 
4.282, showing that the s_i, not σ_i, are the angular momentum operators. This 
is the reason for including the ½'s in the generator exponentials. Essentially 
this is the same as the adoption of the half-angles in the investigation of the 
SU(2)-O₃⁺ homomorphism in Section 4.10. exp(½icσ₃) = exp(ics₃) = U_z is the 
2 × 2 analog of Eq. 4.267. U_z is the rotation matrix and s₃ = σ₃/2 is the 
corresponding angular momentum matrix. 

Equation 4.66 gives the rotation operator for rotating the coordinates in 
the three-space. Using the angular momentum matrix s_z, we have as the 
corresponding coordinate rotation operator in two-dimensional (complex) 
space 

\[
R_z(\varphi) = \exp(i\varphi s_z) = \exp(i\varphi\sigma_z/2).
\]

For rotating the two-component column vector wave function (spinor) of a 
spin ½ particle relative to fixed coordinates, the rotation operator is 

\[
R_z(\varphi) = \exp(-i(\varphi/2)\sigma_z).
\]

Expanding exp(ias₁) = exp(½iaσ₁) as a Maclaurin series, we obtain 

\[
\exp(\tfrac12 ia\sigma_1)
= 1\,\{1 - (a/2)^2/2! + (a/2)^4/4! - \cdots\}
+ i\sigma_1 \{(a/2) - (a/2)^3/3! + (a/2)^5/5! - \cdots\}
\]
\[
= \begin{pmatrix} \cos a/2 & i\sin a/2 \\ i\sin a/2 & \cos a/2 \end{pmatrix}
= 1\cos a/2 + i\sigma_1 \sin a/2,
\tag{4.286}
\]

a special case of Eq. 4.248. The parameter a appears as an angle, the coefficient 
of an angular momentum matrix, like φ in Eq. 4.267. But in SU(2) form the 
angle always appears as a half-angle, a/2. Similarly (completing Eq. 4.248), 

\[
\exp(\tfrac12 ib\sigma_2) = \begin{pmatrix} \cos b/2 & \sin b/2 \\ -\sin b/2 & \cos b/2 \end{pmatrix}
= 1\cos b/2 + i\sigma_2 \sin b/2,
\]
\[
\exp(\tfrac12 ic\sigma_3) = \begin{pmatrix} \exp(\tfrac12 ic) & 0 \\ 0 & \exp(-\tfrac12 ic) \end{pmatrix}
= 1\cos c/2 + i\sigma_3 \sin c/2.
\tag{4.287}
\]

With this identification of the exponentials, the general form of the SU(2) 
matrix may be written as 

\[
U(\alpha, \beta, \gamma) = \exp(\tfrac12 i\gamma\sigma_3)\,\exp(\tfrac12 i\beta\sigma_2)\,\exp(\tfrac12 i\alpha\sigma_3).
\tag{4.288}
\]


² These structure constants (iε_{ijk}) lead to the SU(2) representations of 
dimension 2j + 1 for generators of dimension 2j + 1, j = 0, ½, 1, .... The integral 
j cases also lead to the representations of O₃⁺, as discussed in Section 4.10. 

This reproduces Eq. 4.252 of Section 4.10. With D(α, β, γ) = U†(α, β, γ), 

\[
D^{1/2}(\alpha, \beta, \gamma) = \exp(-\tfrac12 i\alpha\sigma_3)\,\exp(-\tfrac12 i\beta\sigma_2)\,\exp(-\tfrac12 i\gamma\sigma_3),
\]

which leads to Eq. 4.258. The selection of the Pauli matrices corresponds to the 
Euler angle rotations described in Sections 4.3 and 4.10. 
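The half-angle closed forms and the product for D^{1/2} can both be checked by summing the exponential series directly. The sketch below is illustrative; the helper names and angle values are assumptions. It compares exp(i(angle/2)σ_k) with 1 cos(angle/2) + iσ_k sin(angle/2) for k = 1, 2, 3, and then compares the triple product exp(−½iασ₃) exp(−½iβσ₂) exp(−½iγσ₃) with a 2 × 2 matrix whose entries are e^{∓iα/2} times cos β/2 or ±sin β/2 times e^{∓iγ/2}.

```python
import cmath
import math

SIGMA = {1: [[0, 1], [1, 0]], 2: [[0, -1j], [1j, 0]], 3: [[1, 0], [0, -1]]}

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_exp(A, terms=40):
    """Maclaurin series 1 + A + A^2/2! + ..., truncated after `terms` terms."""
    n = len(A)
    result = [[complex(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def pauli_exp(k, angle):
    """exp(i (angle/2) sigma_k), summed term by term."""
    return mat_exp([[0.5j * angle * entry for entry in row] for row in SIGMA[k]])

def half_angle_form(k, angle):
    """1 cos(angle/2) + i sigma_k sin(angle/2)."""
    c, s = math.cos(angle / 2), math.sin(angle / 2)
    return [[c * (i == j) + 1j * s * SIGMA[k][i][j] for j in range(2)] for i in range(2)]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

closed_form_error = max(max_diff(pauli_exp(k, 0.9), half_angle_form(k, 0.9))
                        for k in SIGMA)

alpha, beta, gamma = 0.3, 1.1, -0.7          # arbitrary Euler angles
D_half = matmul(pauli_exp(3, -alpha), matmul(pauli_exp(2, -beta), pauli_exp(3, -gamma)))
c, s = math.cos(beta / 2), math.sin(beta / 2)
ea, eg = cmath.exp(0.5j * alpha), cmath.exp(0.5j * gamma)
D_expected = [[c / (ea * eg), -s * eg / ea],
              [s * ea / eg, c * ea * eg]]
d_half_error = max_diff(D_half, D_expected)
print(closed_form_error, d_half_error)
```

Both errors should be at rounding level, consistent with the statement that the product of exponentials leads to the j = ½ rotation matrix.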

Further examples of the infinitesimal rotation — exponentiation generator 
technique — appear in Section 4.13. 


EXERCISES 

4.11.1 A translation operator T(a) converts ψ(x) to ψ(x + a), 

\[
T(a)\psi(x) = \psi(x + a).
\]

In terms of the (quantum mechanical) linear momentum operator p_x = −i d/dx, 
show that 

\[
T(a) = \exp(iap_x).
\]

Hint. Expand ψ(x + a) as a Taylor series. 

4.11.2 Consider the general SU(2) element Eq. 4.225 to be built up of three Euler 
rotations: (i) a rotation of a/2 about the z-axis, (ii) a rotation of b/2 about the 
new x-axis, and (iii) a rotation of c/2 about the new z-axis. (All rotations 
counterclockwise.) Using the Pauli σ generators, show that these rotation angles are 
determined by 

\[
a = \xi - \zeta + \pi/2 = \alpha + \pi/2, \qquad
b = 2\eta = \beta, \qquad
c = \xi + \zeta - \pi/2 = \gamma - \pi/2.
\]

Note. The angles a and b here are not the a and b of Eq. 4.224. 

4.11.3 The angular momentum-exponential form of the Euler angle rotation operators is 

\[
\mathcal{R} = \mathcal{R}_{z''}(\gamma)\,\mathcal{R}_{y'}(\beta)\,\mathcal{R}_z(\alpha)
= \exp(-i\gamma J_{z''})\,\exp(-i\beta J_{y'})\,\exp(-i\alpha J_z).
\]

Show that in terms of the original axes: 

\[
\mathcal{R} = \exp(-i\alpha J_z)\,\exp(-i\beta J_y)\,\exp(-i\gamma J_z).
\]

Hint. The ℛ operators transform as matrices. The rotation about the y′-axis 
(second Euler rotation) may be referred to the original y-axis by 

\[
\exp(-i\beta J_{y'}) = \exp(-i\alpha J_z)\,\exp(-i\beta J_y)\,\exp(i\alpha J_z).
\]


4.12 SU(2), SU(3), AND NUCLEAR PARTICLES 

The application of group theory to “elementary” particles has been labeled 
by Wigner the third stage of group theory and physics. The first stage was 
the search for the 32 point groups and the 230 space groups giving crystal 
symmetries (Section 4.9). The second stage was a search for representations 
such as the representations of O₃⁺ and SU(2) (Section 4.10). Now in this third 
stage, physicists are back to a search for groups. 

In discussing the strongly interacting particles of high energy physics and the 
special unitary groups SU(2) and SU(3), we should look to angular momentum 
and the rotation group O₃⁺ for an analogy. Suppose we have an electron in 
the spherically symmetric attractive potential of some atomic nucleus. The 
electron's Schrödinger wavefunction may be characterized by three quantum 
numbers n, l, and m. The energy, however, is (2l + 1)-fold degenerate, depending 
only on n and l.¹ The reason for this degeneracy may be stated in two equivalent 
ways: 

1. The potential is spherically symmetric, independent 
of θ and φ, and 

2. The Schrödinger Hamiltonian −(ħ²/2m_e)∇² + V(r) 
is invariant under ordinary spatial rotations (O₃⁺). 

As a consequence of the spherical symmetry of the potential, the angular 
momentum L is conserved. In Section 4.11 the cartesian components of L are 
identified as the generators of the rotation group O₃⁺. Instead of representing 
L_x, L_y, and L_z by operators, let us use matrices. The exercises at the end of 
Section 4.2 provide examples for l = ½, 1, and 3/2. The L_i matrices are (2l + 1) × 
(2l + 1) matrices with the dimension the same as the number of the degenerate 
states.² These L_i matrices generate the (2l + 1) × (2l + 1) irreducible 
representations of O₃⁺. The dimension 2l + 1 is identified with the 2l + 1 degenerate 
states. 

The common method of eliminating this degeneracy is to introduce a 
constant magnetic induction B. This leads to the Zeeman effect. This magnetic 
induction adds a term to the Schrödinger Hamiltonian that is not invariant 
under O₃⁺. This is a symmetry-breaking term. 

So much for the analogy. In the case of the strongly interacting particles 
(neutrons, protons, etc.) we cannot follow the analogy directly, because we 
do not yet fully understand the nuclear interaction. We do not know the 
Hamiltonian. So instead, let us run the analogy backward. 

In the 1930s Heisenberg proposed that nuclear forces were charge-independent,
that the only two massive particles (baryons) known then, the neutron
and proton, were two different states of the same particle. Table 4.2 shows that
they have almost the same mass. The fractional difference, $(m_n - m_p)/m_p \approx 0.0014$,
is small, suggesting that the mass difference is produced by a small
charge-dependent perturbation. It was convenient to describe this near degeneracy
by introducing a quantity $I$ with $z$-projections $I_3 = \tfrac{1}{2}$ for the proton,
$-\tfrac{1}{2}$ for the neutron. The name coined for $I$ was isospin. Isospin had nothing
to do with spin (the particle's intrinsic angular momentum) but the two-


¹ If the potential is a pure Coulomb potential, the energy depends only on $n$
(see Section 13.2).

² With $L_i$ a matrix, the Schrödinger wavefunction $\psi(r, \theta, \varphi)$ is replaced by a
state vector with $2l + 1$ components. Angular momentum and the $(2l + 1)$-fold
degeneracy are discussed at some length in Section 12.7.
TABLE 4.3  Baryons with Spin 1/2, Even Parity

Multiplet   Particle   Mass (MeV)    Y      I      I_3
Ξ           Ξ⁻         1321.300      −1     1/2    −1/2
            Ξ⁰         1314.900                    +1/2
Σ           Σ⁻         1197.410       0     1      −1
            Σ⁰         1192.540                     0
            Σ⁺         1189.470                    +1
Λ           Λ          1115.500       0     0       0
N           n           939.550       1     1/2    −1/2
            p           938.256                    +1/2


component isospin state vector obeyed the same mathematical relations as the
spin $\tfrac{1}{2}$ state vector, and in particular could be taken to be an eigenvector
of the Pauli $\sigma_3$ matrix.

In the absence of charge-dependent forces, isospin is conserved (the proton
and neutron have the same mass) and we have a twofold degeneracy. Equivalently,
the unknown nuclear Hamiltonian must be invariant under the group
generated by the isospin matrices. The isospin matrices are just the three Pauli
matrices ($2 \times 2$ matrices), and the group generated is the SU(2) group of
Section 4.10, also $2 \times 2$, corresponding to our twofold degeneracy.

By 1961 many more particles had been discovered (or created). The eight
shown in Table 4.3 attracted particular attention.³ It was convenient to describe
them by characteristic quantum numbers, $I$ for isospin, and $Y$ for hypercharge.
The particles may be grouped into charge or isospin multiplets. Then the
hypercharge $Y$ may be taken as twice the average charge of the multiplet. For
the neutron-proton multiplet

$$Y = 2 \cdot \tfrac{1}{2}(0 + 1) = 1. \tag{4.289}$$

The hypercharge and isospin values are listed in Table 4.3. 
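
The hypercharge rule just stated can be checked against Table 4.3 with a few lines of Python. This is a minimal sketch: the charge lists are read off the particle superscripts (Ξ⁻, Ξ⁰, and so on), and the names `MULTIPLET_CHARGES`, `hypercharge`, and `isospin` are illustrative choices, not notation from the text.

```python
from fractions import Fraction

# Charges (in units of e) of the members of each isospin multiplet,
# taken from the particle superscripts in Table 4.3.
MULTIPLET_CHARGES = {
    "Xi": [-1, 0],
    "Sigma": [-1, 0, 1],
    "Lambda": [0],
    "N": [0, 1],
}

def hypercharge(charges):
    """Y = twice the average charge of the multiplet."""
    return 2 * Fraction(sum(charges), len(charges))

def isospin(charges):
    """I follows from the multiplicity 2I + 1 of the multiplet."""
    return Fraction(len(charges) - 1, 2)

for name, charges in MULTIPLET_CHARGES.items():
    print(name, "Y =", hypercharge(charges), "I =", isospin(charges))
```

The printed values can be compared column by column with the $Y$ and $I$ entries of Table 4.3.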

From scattering and production experiments it had become clear that both
hypercharge $Y$ and isospin $I$ were conserved under strong (nuclear) interaction.
Remember that $\mathbf{L}$ (or $l$) is conserved under a spherically symmetric Hamiltonian.
The eight particles thus appeared as an eightfold degeneracy, but now with two 
quantities to be conserved. In 1961 Gell-Mann, and independently Ne’eman, 
suggested that the strong interaction should be invariant under the three- 
dimensional special unitary group, SU(3), that is, should have SU(3) symmetry. 

The choice of SU(3) was based first on the existence of two conserved 
quantities. This dictated a group of rank 2, a group, two of whose generators 


³ All masses are given in energy units, MeV.
(and only two) commuted. Second, the group had to have an $8 \times 8$ representation
to account for the eight degenerate baryons. In a sense SU(3) was the
simplest generalization of SU(2). Gell-Mann set up eight generators: three for
the components of isospin, one for hypercharge, and four additional ones. All
are $3 \times 3$, zero-trace matrices. As with $O_3$ and SU(2), there are an infinity of
irreducible representations. An eight-dimensional one was associated with the
eight particles of Table 4.3.⁴

We imagine the Hamiltonian for our eight baryons to be composed of three 
parts 

$$H = H_{\text{strong}} + H_{\text{medium}} + H_{\text{electromagnetic}}. \tag{4.290}$$

The first part, $H_{\text{strong}}$, possesses the SU(3) symmetry and leads to the eightfold
degeneracy. Introduction of a symmetry-breaking interaction, $H_{\text{medium}}$, removes
part of the degeneracy, giving the four isospin multiplets $\Xi$, $\Sigma$, $\Lambda$, and $N$. These
are multiplets because $H_{\text{medium}}$ still possesses SU(2) symmetry. Finally, the
presence of charge-dependent forces splits the isospin multiplets and removes
the last degeneracy. This imagined sequence is shown in Fig. 4.16.

Applying first-order perturbation theory of quantum mechanics, simple 
relations among the baryon masses may be calculated. Also, intensity rules for 
decay and scattering processes may be obtained. 

Perhaps the most spectacular success of this SU(3) model has been its
prediction of new particles. In 1961 four $K$ and three $\pi$ mesons (all pseudoscalar;
spin 0, odd parity) suggested another octet, similar to the baryon octet. The
SU(3) theory predicted an eighth meson $\eta^0$, mass 563 MeV. The $\eta^0$ meson,
experimentally determined mass 548 MeV, was found soon after. Groupings
of nine of the heavier baryons (all with spin $\tfrac{3}{2}$, even parity) suggested a
10-member group or decuplet. The missing tenth baryon was predicted to have a
mass of about 1680 MeV and a negative charge. In 1964 the negatively charged
$\Omega^-$, mass 1675 ± 12 MeV, was discovered.

Since the completion of this $\tfrac{3}{2}^+$ decuplet, a (odd parity) multiplet for
baryons and $1^-$ and $2^+$ multiplets for mesons have been established.

The application of group theory to strongly interacting particles has been
extended beyond SU(3). There has been an extensive investigation of SU(6)
and of the more complex, higher-dimensional groups. Great attention has been
paid to the group generators and to the structure constants in the generator
commutation relations (such as $i\varepsilon_{ijk}$ for orbital angular momentum). These
structure constants define a Lie algebra. It is possible to associate space integrals
of current densities with the group generators. This leads to a current algebra
far beyond the scope of this discussion.

To keep group theory and its very real accomplishment in proper perspective, 
we should emphasize that group theory identifies and formalizes symmetries. 


⁴ This application of SU(3) has been called by Gell-Mann the "eightfold
way." Note the eight independent parameters of SU(3) (from $n^2 - 1$), the
eight generators, the $8 \times 8$ representation associated with eight particles. The
name also refers to the Eightfold Way of Buddha.
FIG. 4.16 Baryon mass splitting


It classifies (and sometimes predicts) particles. But aside from saying that one 
part of the Hamiltonian has SU(2) symmetry and another part has SU(3) 
symmetry, group theory says nothing about the particle interactions. Remember 
that the statement that the atomic potential is spherically symmetric tells us 
nothing about the radial dependence of the potential or of the wavefunction. 

4.13 HOMOGENEOUS LORENTZ GROUP 

Generalizing the approach to vectors of Section 1.2, scientists demand that
our physical laws be covariant¹ under

a. space and time translations, 

b. rotations in real, three-dimensional space, and 

c. Lorentz transformations. 

The demand for covariance under translations is based on the homogeneity of 
space and time. Covariance under rotations is an assertion of the isotropy of 
space. The requirement of Lorentz covariance is based on acceptance of special 
relativity. All three of these transformations together form the inhomogeneous
Lorentz group or the Poincaré group. Here we exclude translations. The
space rotations and the Lorentz transformations together form a group, the
homogeneous Lorentz group.

We first generate a subgroup, the Lorentz transformations in which the 


1 To be covariant means to have the same form in different coordinate systems 
so that there is no preferred reference system (compare Sections 1.2 and 3.1). 
relative velocity $v$ is along the $x = x_1$ axis. The generator may be determined
by considering Lorentz space-time reference frames moving with a relative
velocity $\delta v$, an infinitesimal.² The relations are similar to those for rotations
in real space, Sections 1.2, 3.1, and 4.3, except that here the angle of rotation
is pure imaginary (compare Section 3.7).

We work in Minkowski space with $x_4 = ict$. For an infinitesimal relative
velocity $\delta v$ the space-time transformation is Galilean:

$$x_1' = x_1 - \delta v\, t = x_1 + i\,\delta\beta\, x_4. \tag{4.291}$$

Here, as usual, $\beta = v/c$. By symmetry we also write

$$x_4' = x_4 + i a\,\delta\beta\, x_1 \tag{4.292}$$

with $a$ a parameter that is fixed by the requirement that $x_1^2 + x_4^2$ be invariant,

$$x_1'^2 + x_4'^2 = x_1^2 + x_4^2. \tag{4.293}$$

Remember $x_\mu$ is the prototype four-dimensional vector in Minkowski space.
Thus Eq. 4.293 is simply a statement of the invariance of the square of the
magnitude of the "distance" vector under rotation in Minkowski space. Here
is where the special relativity is brought into our transformation. Squaring
and adding Eqs. 4.291 and 4.292 and discarding terms of order $(\delta\beta)^2$, we find
$a = -1$. Equations 4.291 and 4.292 may be combined as a matrix equation

$$\begin{pmatrix} x_1' \\ x_4' \end{pmatrix} = (1 + \delta\beta\,\sigma_2) \begin{pmatrix} x_1 \\ x_4 \end{pmatrix}. \tag{4.294}$$

$\sigma_2$ happens to be the negative of the Pauli matrix, $\sigma_y$.

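The parameter $\delta\beta$ represents an infinitesimal change. Using the same
techniques as in Section 4.11, we repeat the transformation $N$ times to develop
a finite transformation with the velocity parameter $\theta = N\,\delta\beta$. Then

$$\begin{pmatrix} x_1' \\ x_4' \end{pmatrix} = \left(1 + \frac{\theta\sigma_2}{N}\right)^{N} \begin{pmatrix} x_1 \\ x_4 \end{pmatrix}. \tag{4.295}$$

In the limit as $N \to \infty$,

$$\lim_{N\to\infty} \left(1 + \frac{\theta\sigma_2}{N}\right)^{N} = \exp \theta\sigma_2. \tag{4.296}$$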


As in Section 4.11, the exponential is interpreted by a Maclaurin expansion

$$\exp \theta\sigma_2 = 1 + \theta\sigma_2 + (\theta\sigma_2)^2/2! + (\theta\sigma_2)^3/3! + \cdots. \tag{4.297}$$

Noting that $\sigma_2^2 = 1$,

$$\exp \theta\sigma_2 = 1 \cosh\theta + \sigma_2 \sinh\theta. \tag{4.298}$$
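
The step from the Maclaurin series to the closed form can be checked numerically. The sketch below assumes the explicit $2 \times 2$ generator with rows $(0, i)$ and $(-i, 0)$, read off from Eqs. 4.291 and 4.292 with $a = -1$; the helper names (`matmul`, `mat_exp`, `closed_form`, `max_diff`) are illustrative and use only the standard library.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(m, theta, terms=40):
    """Maclaurin series for exp(theta * m), truncated after `terms` terms."""
    n = len(m)
    total = identity(n)
    power = identity(n)          # holds (theta * m)^k / k!
    for k in range(1, terms):
        power = matmul(power, m)
        power = [[theta * x / k for x in row] for row in power]
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]
    return total

# Assumed explicit form of the generator (from Eqs. 4.291-4.292 with a = -1).
SIGMA_2 = [[0, 1j], [-1j, 0]]

def closed_form(theta):
    """Right-hand side of Eq. 4.298 written out as a matrix."""
    c, s = math.cosh(theta), math.sinh(theta)
    return [[c, 1j * s], [-1j * s, c]]

def max_diff(a, b):
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))

theta = 0.7
print(max_diff(mat_exp(SIGMA_2, theta), closed_form(theta)))
```

The printed difference should be at the level of floating-point rounding, which reflects the fact that even powers of the generator reduce to the unit matrix and odd powers to the generator itself.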

Hence our finite Lorentz transformation is

$$\begin{pmatrix} x_1' \\ x_4' \end{pmatrix} = \begin{pmatrix} \cosh\theta & i\sinh\theta \\ -i\sinh\theta & \cosh\theta \end{pmatrix} \begin{pmatrix} x_1 \\ x_4 \end{pmatrix}. \tag{4.299}$$

$\sigma_2$ has generated the representations of this special Lorentz transformation.

² This derivation, with a slightly different metric, appears in an article by
Strecker, J. L., Am. J. Phys. 35, 12 (1967).

Cosh $\theta$ and sinh $\theta$ may be identified by considering the origin of the primed
coordinate system, $x_1' = 0$, or $x_1 = vt$. Substituting into Eq. 4.299, we have

$$0 = x_1 \cosh\theta + x_4\, i \sinh\theta. \tag{4.300}$$

With $x_1 = vt$ and $x_4 = ict$,

$$\tanh\theta = \beta = \frac{v}{c}.$$

Note that $\theta \neq v/c$ except in the limit as $v \to 0$.

Using $1 - \tanh^2\theta = (\cosh^2\theta)^{-1}$,

$$\cosh\theta = (1 - \beta^2)^{-1/2} \equiv \gamma, \qquad \sinh\theta = \beta\gamma. \tag{4.301}$$

The matrix in Eq. 4.299 agrees with the $x_3$-$x_4$ portion of the matrix in Eq.
3.120.
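
A short numerical check of Eq. 4.301 may help fix the meaning of the velocity parameter. The sketch below uses only the standard `math` module; the function names are illustrative. The last function uses the additivity of $\theta$ for collinear boosts, which is the starting point of Exercise 4.13.4.

```python
import math

def rapidity(beta):
    """Velocity parameter theta defined by tanh(theta) = beta = v/c."""
    return math.atanh(beta)

def gamma(beta):
    return 1.0 / math.sqrt(1.0 - beta * beta)

beta = 0.6
theta = rapidity(beta)
print(math.cosh(theta), gamma(beta))          # the two numbers should agree
print(math.sinh(theta), beta * gamma(beta))   # likewise

def combined_beta(beta1, beta2):
    """Collinear boosts: the velocity parameters add, theta = theta1 + theta2."""
    return math.tanh(rapidity(beta1) + rapidity(beta2))

print(combined_beta(0.5, 0.5))                # stays below 1
```

Note that `theta` differs from `beta` itself (about 0.693 against 0.6 here), in line with the remark that $\theta \neq v/c$ except for small $v$.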

The preceding special case of the velocity parallel to one space axis is easy,
but it illustrates the infinitesimal velocity, exponentiation, generator technique.
Now we apply this exact technique to derive the Lorentz transformation
for the relative velocity $\mathbf{v}$ not parallel to any space axis.

Let $v_1 = \lambda|\mathbf{v}|$, $v_2 = \mu|\mathbf{v}|$, and $v_3 = \nu|\mathbf{v}|$ with $\lambda$, $\mu$, and $\nu$ the direction cosines
of $\mathbf{v}$. In analogy to Eq. 4.291 we write

$$\begin{aligned} x_1' &= x_1 + i\lambda\,\delta\beta\, x_4 \\ x_2' &= x_2 + i\mu\,\delta\beta\, x_4 \\ x_3' &= x_3 + i\nu\,\delta\beta\, x_4. \end{aligned} \tag{4.302}$$

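Again, by symmetry we try

$$x_4' = x_4 + i a_1\,\delta\beta\, x_1 + i a_2\,\delta\beta\, x_2 + i a_3\,\delta\beta\, x_3. \tag{4.303}$$

From

$$\sum_{i=1}^{4} x_i'^2 = \sum_{i=1}^{4} x_i^2, \tag{4.304}$$

$a_1 = -\lambda$, $a_2 = -\mu$, and $a_3 = -\nu$.
Rewriting Eqs. 4.302 and 4.303 as a matrix equation, we have

$$\begin{pmatrix} x_1' \\ x_2' \\ x_3' \\ x_4' \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 & i\lambda\,\delta\beta \\ 0 & 1 & 0 & i\mu\,\delta\beta \\ 0 & 0 & 1 & i\nu\,\delta\beta \\ -i\lambda\,\delta\beta & -i\mu\,\delta\beta & -i\nu\,\delta\beta & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}. \tag{4.305}$$

Subtracting out 1 and removing $\delta\beta$ as a factor, we obtain

$$x' = (1 + \delta\beta\,\sigma)x. \tag{4.306}$$
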





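Here

$$\sigma = \begin{pmatrix} 0 & 0 & 0 & i\lambda \\ 0 & 0 & 0 & i\mu \\ 0 & 0 & 0 & i\nu \\ -i\lambda & -i\mu & -i\nu & 0 \end{pmatrix}. \tag{4.307}$$

By direct multiplication (with $\lambda^2 + \mu^2 + \nu^2 = 1$),

$$\sigma^2 = \begin{pmatrix} \lambda^2 & \lambda\mu & \lambda\nu & 0 \\ \lambda\mu & \mu^2 & \mu\nu & 0 \\ \lambda\nu & \mu\nu & \nu^2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{4.308}$$

and

$$\sigma^3 = \sigma. \tag{4.309}$$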


As before, we iterate $N$ times with $\theta = N\,\delta\beta$. Forming the exponential

$$\lim_{N\to\infty} (1 + \theta\sigma/N)^{N} = e^{\theta\sigma} = 1 + \sigma\sinh\theta + \sigma^2(\cosh\theta - 1). \tag{4.310}$$

$\sigma$ is our generator with the parameters $\lambda$, $\mu$, and $\nu$ defining the direction of
the velocity built in. Writing out the second part of Eq. 4.310, the Lorentz
transformation matrix in all its glory is


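$$L(\mathbf{v}) = \begin{pmatrix} 1 + \lambda^2(\cosh\theta - 1) & \lambda\mu(\cosh\theta - 1) & \lambda\nu(\cosh\theta - 1) & i\lambda\sinh\theta \\ \lambda\mu(\cosh\theta - 1) & 1 + \mu^2(\cosh\theta - 1) & \mu\nu(\cosh\theta - 1) & i\mu\sinh\theta \\ \lambda\nu(\cosh\theta - 1) & \mu\nu(\cosh\theta - 1) & 1 + \nu^2(\cosh\theta - 1) & i\nu\sinh\theta \\ -i\lambda\sinh\theta & -i\mu\sinh\theta & -i\nu\sinh\theta & \cosh\theta \end{pmatrix}. \tag{4.311}$$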


Again, $\cosh\theta = (1 - \beta^2)^{-1/2} = \gamma$, $\sinh\theta = \beta\gamma$.
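
As a hedged illustration, the matrix of Eq. 4.311 can be built directly and tested against the invariance condition of Eq. 4.304. The sketch uses plain nested lists and complex numbers from the standard library; `boost_matrix`, `apply`, and `interval` are illustrative names, and the sample four-vector and direction cosines are arbitrary choices.

```python
import math

def boost_matrix(beta, direction):
    """L(v) of Eq. 4.311 for beta = v/c along direction cosines (lam, mu, nu)."""
    theta = math.atanh(beta)
    c, s = math.cosh(theta), math.sinh(theta)
    n = list(direction)
    L = [[0j] * 4 for _ in range(4)]
    for i in range(3):
        for j in range(3):
            L[i][j] = (1.0 if i == j else 0.0) + n[i] * n[j] * (c - 1.0)
        L[i][3] = 1j * n[i] * s       # last column:  i * n_i * sinh(theta)
        L[3][i] = -1j * n[i] * s      # last row:    -i * n_i * sinh(theta)
    L[3][3] = c
    return L

def apply(L, x):
    return [sum(L[i][k] * x[k] for k in range(4)) for i in range(4)]

def interval(x):
    """Sum of x_mu^2 with x_4 = ict, the quantity held fixed in Eq. 4.304."""
    return sum(xi * xi for xi in x)

direction = (2 / 7, 3 / 7, 6 / 7)     # direction cosines; squares sum to 1
x = [1.0, -2.0, 0.5, 1j * 3.0]        # (x1, x2, x3, ict) with ct = 3
x_prime = apply(boost_matrix(0.8, direction), x)
print(interval(x), interval(x_prime))
```

With the direction set to `(1, 0, 0)` the first and fourth rows and columns reduce to the $2 \times 2$ matrix of Eq. 4.299.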

It is worth noting that the combination of Eqs. 4.310 and 4.311,

$$L(\mathbf{v}) = e^{\theta\sigma}, \tag{4.312}$$

is not in the exact form of Eq. 4.261. The exponent lacks the factor $i$, and $L(\mathbf{v})$
is not unitary.

The matrices given by Eq. 4.299 for the case of $\mathbf{v} = \mathbf{i}v_x$ form a subgroup.
The matrices of Eq. 4.311 do not. The product of two Lorentz transformation
matrices, $L(\mathbf{v}_1)$ and $L(\mathbf{v}_2)$, yields a third Lorentz matrix $L(\mathbf{v}_3)$ if the two
velocities $\mathbf{v}_1$ and $\mathbf{v}_2$ are parallel. The resultant velocity $\mathbf{v}_3$ is related to $\mathbf{v}_1$ and $\mathbf{v}_2$
by the Einstein velocity addition law, Section 3.7. If $\mathbf{v}_1$ and $\mathbf{v}_2$ are not parallel,
no such simple relation exists. Specifically, consider three reference frames $S$,
$S'$, and $S''$, with $S$ and $S'$ related by $L(\mathbf{v}_1)$, and $S'$ and $S''$ related by $L(\mathbf{v}_2)$.
If the velocity of $S''$ relative to the original system $S$ is $\mathbf{v}_3$, $S''$ is not obtained
from $S$ by $L(\mathbf{v}_3) = L(\mathbf{v}_2)L(\mathbf{v}_1)$. Rather, we find that

$$L(\mathbf{v}_3) = R\,L(\mathbf{v}_2)L(\mathbf{v}_1) \tag{4.313}$$

where $R$ is a $3 \times 3$ space rotation matrix embedded in our four-dimensional
space-time. With $\mathbf{v}_1$ and $\mathbf{v}_2$ not parallel, the final system $S''$ is rotated relative
to $S$. This rotation is the origin of the Thomas precession involved in spin-orbit
coupling terms in atomic and nuclear physics. Because of its presence, the $L(\mathbf{v})$
by themselves do not form a group.
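
The failure of closure can be seen numerically. By inspection, every matrix of the form of Eq. 4.311 is Hermitian (real symmetric space block, complex-conjugate off-diagonal time entries), so a product of boosts that is not Hermitian cannot be a pure boost. The sketch below is one way to test this under that observation; `boost`, `matmul`, and `hermitian_defect` are illustrative names.

```python
import math

def boost(beta, n):
    """Boost matrix of Eq. 4.311 (x4 = ict) for speed beta along unit vector n."""
    theta = math.atanh(beta)
    c, s = math.cosh(theta), math.sinh(theta)
    L = [[0j] * 4 for _ in range(4)]
    for i in range(3):
        for j in range(3):
            L[i][j] = (1.0 if i == j else 0.0) + n[i] * n[j] * (c - 1.0)
        L[i][3] = 1j * n[i] * s
        L[3][i] = -1j * n[i] * s
    L[3][3] = c
    return L

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def hermitian_defect(m):
    """Largest |m_ij - conj(m_ji)|; zero for any matrix of the form 4.311."""
    return max(abs(m[i][j] - m[j][i].conjugate())
               for i in range(4) for j in range(4))

X, Y = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)

parallel = matmul(boost(0.3, X), boost(0.6, X))   # both along x
crossed = matmul(boost(0.6, Y), boost(0.6, X))    # x first, then y

print(hermitian_defect(parallel))   # numerically zero: still a pure boost
print(hermitian_defect(crossed))    # clearly nonzero: not of the form 4.311
```

The nonzero defect in the crossed case is the numerical trace of the rotation $R$ in Eq. 4.313.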


EXERCISES 


4.13.1 Obtain $\sigma(\lambda, \mu, \nu)$ by differentiating the final matrix, Eq. 4.311.

4.13.2 Two Lorentz transformations are carried out in succession: $v_1$ along the $x$-axis,
then $v_2$ along the $y$-axis. Show that the resultant transformation (given by the
product of these two successive transformations) cannot be put in the form of
Eq. 4.311.

Note. The discrepancy corresponds to a rotation.

4.13.3 Rederive the Lorentz transformation working entirely in the real space
$(x_0, x_1, x_2, x_3)$ with $x_0 = ct$. Show that the Lorentz transformation may again be written
$L(\mathbf{v}) = \exp(\theta\sigma)$, Eq. 4.312, but now with

$$\sigma = -\begin{pmatrix} 0 & \lambda & \mu & \nu \\ \lambda & 0 & 0 & 0 \\ \mu & 0 & 0 & 0 \\ \nu & 0 & 0 & 0 \end{pmatrix}.$$

4.13.4 Using the matrix relation, Eq. 4.299, let the velocity parameter $\theta_1$ relate the
Lorentz reference frames $(x_1', x_4')$ and $(x_1, x_4)$. Let $\theta_2$ relate $(x_1'', x_4'')$ and $(x_1', x_4')$.
Finally, let $\theta$ relate $(x_1'', x_4'')$ and $(x_1, x_4)$. From $\theta = \theta_1 + \theta_2$ derive the Einstein
velocity addition law

$$v = \frac{v_1 + v_2}{1 + v_1 v_2/c^2}.$$

5 INFINITE SERIES


5.1 FUNDAMENTAL CONCEPTS 

Infinite series, literally summations of an infinite number of terms, occur 
frequently in both pure and applied mathematics. They may be used by the 
pure mathematician to define functions as a fundamental approach to the theory 
of functions, as well as for calculating accurate values of transcendental constants 
and transcendental functions. In the mathematics of science and engineering 
infinite series are ubiquitous, for they appear in the evaluation of integrals
(Sections 5.6 and 5.7), in the solution of differential equations (Sections 8.5 and
8.6), and as Fourier series (Chapter 14), and they compete with integral representations
for the description of a host of special functions (Chapters 11, 12, and 13).
In Section 16.3 the Neumann series solution for integral equations provides one 
more example of the occurrence and use of infinite series. 

Right at the start we face the problem of attaching meaning to the sum of an
infinite number of terms. The usual approach is by partial sums. If we have an
infinite sequence of terms $u_1, u_2, u_3, u_4, u_5, \ldots$, we define the $i$th partial sum as

$$s_i = \sum_{n=1}^{i} u_n. \tag{5.1}$$

This is a finite summation and offers no difficulties. If the partial sums $s_i$ converge
to a (finite) limit as $i \to \infty$,

$$\lim_{i\to\infty} s_i = S, \tag{5.2}$$

the infinite series $\sum_{n=1}^{\infty} u_n$ is said to be convergent and to have the value $S$. Note
carefully that we reasonably, plausibly, but still arbitrarily define the infinite
series as equal to $S$. The reader should also note that a necessary condition for
this convergence to a limit is that $\lim_{n\to\infty} u_n = 0$. This condition, however, is not
sufficient to guarantee convergence. Equation 5.2 is usually written in formal
mathematical notation:

The condition for the existence of a limit $S$ is that for each $\varepsilon > 0$,
there is a fixed $N$ such that

$$|S - s_i| < \varepsilon, \quad \text{for } i > N.$$

This condition is often derived from the Cauchy criterion applied to the partial
sums $s_i$. The Cauchy criterion is:

A necessary and sufficient condition that a sequence $(s_i)$ converge
is that for each $\varepsilon > 0$ there is a fixed number $N$ such that

$$|s_j - s_i| < \varepsilon \quad \text{for all } i, j > N.$$

This means that the individual partial sums must cluster together as
we move far out in the sequence.

The Cauchy criterion may easily be extended to sequences of functions. We see 
it in this form in Section 5.5 in the definition of uniform convergence and in 
Section 9.4 in the development of Hilbert space. 

Our partial sums $s_i$ may not converge to a single limit but may oscillate,
as in the case

$$\sum_{n=1}^{\infty} u_n = 1 - 1 + 1 - 1 + 1 - \cdots - (-1)^n + \cdots. \tag{5.3}$$

Clearly, $s_i = 1$ for $i$ odd but 0 for $i$ even. There is no convergence to a limit,
and series such as this one are labeled oscillatory.

For the series

$$1 + 2 + 3 + \cdots + n + \cdots \tag{5.4}$$

we have

$$s_n = \frac{n(n + 1)}{2}. \tag{5.5}$$

As $n \to \infty$,

$$\lim_{n\to\infty} s_n = \infty. \tag{5.6}$$

Whenever the sequence of partial sums diverges (approaches $\pm\infty$), the infinite
series is said to diverge. Often the term divergent is extended to include oscillatory
series as well.
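
The two behaviors just described, oscillation and divergence, show up immediately when the partial sums are tabulated. A minimal sketch using `itertools.accumulate` from the standard library (the function name `partial_sums` is illustrative):

```python
from itertools import accumulate

def partial_sums(terms):
    """Running sums s_1, s_2, ... of a finite list of terms."""
    return list(accumulate(terms))

# u_n = -(-1)^n for n = 1, 2, ... gives 1 - 1 + 1 - 1 + ... (Eq. 5.3)
oscillating = [-(-1) ** n for n in range(1, 9)]
print(partial_sums(oscillating))

# u_n = n gives 1 + 2 + 3 + ..., with s_n = n(n + 1)/2 (Eqs. 5.4 and 5.5)
growing = list(range(1, 9))
print(partial_sums(growing))
print([n * (n + 1) // 2 for n in range(1, 9)])
```

The first list alternates between 1 and 0, while the second and third lists agree term by term and grow without bound.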

Because we evaluate the partial sums by ordinary arithmetic, the convergent 
series, defined in terms of a limit of the partial sums, assume a position of 
supreme importance. Two examples may clarify the nature of convergence or 
divergence of a series and will also serve as a basis for a further detailed investiga- 
tion in the next section. 


EXAMPLE 5.1.1 The Geometric Series

The geometrical sequence, starting with $a$ and with a ratio $r$ ($r > 0$), is given
by

$$a,\ ar,\ ar^2,\ ar^3,\ \ldots,\ ar^{n-1},\ \ldots.$$

The $n$th partial sum is given by¹

$$s_n = a\,\frac{1 - r^n}{1 - r}. \tag{5.7}$$

¹ Multiply and divide $s_n = \sum_{m=0}^{n-1} ar^m$ by $1 - r$.
Taking the limit as $n \to \infty$,

$$\lim_{n\to\infty} s_n = \frac{a}{1 - r}, \quad \text{for } r < 1. \tag{5.8}$$

Hence, by definition, the infinite geometric series converges for $r < 1$ and is
given by

$$\sum_{n=1}^{\infty} ar^{n-1} = \frac{a}{1 - r}. \tag{5.9}$$

On the other hand, if $r \geq 1$, the necessary condition $u_n \to 0$ is not satisfied and
the infinite series diverges.
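
A brief numerical illustration of Eqs. 5.8 and 5.9, with arbitrarily chosen sample values $a = 3$ and $r = 1/2$; the function names are illustrative.

```python
def geometric_partial_sum(a, r, n):
    """s_n = a + a r + ... + a r^(n-1), summed term by term."""
    return sum(a * r ** k for k in range(n))

def geometric_closed_form(a, r, n):
    """s_n = a (1 - r^n) / (1 - r), valid for r != 1."""
    return a * (1 - r ** n) / (1 - r)

a, r = 3.0, 0.5
for n in (1, 5, 10, 50):
    print(n, geometric_partial_sum(a, r, n), geometric_closed_form(a, r, n))
print("limit a/(1 - r) =", a / (1 - r))
```

The term-by-term sum and the closed form agree at every `n`, and both settle onto $a/(1 - r)$ as `n` grows.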

EXAMPLE 5.1.2 The Harmonic Series 


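As a second and more involved example, we consider the harmonic series

$$\sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{n} + \cdots. \tag{5.10}$$

We have $\lim_{n\to\infty} u_n = \lim_{n\to\infty} 1/n = 0$, but this is not sufficient to guarantee
convergence. If we group the terms (no change in order) as

$$1 + \tfrac{1}{2} + \left(\tfrac{1}{3} + \tfrac{1}{4}\right) + \left(\tfrac{1}{5} + \tfrac{1}{6} + \tfrac{1}{7} + \tfrac{1}{8}\right) + \left(\tfrac{1}{9} + \cdots + \tfrac{1}{16}\right) + \cdots, \tag{5.11}$$

it will be seen that each pair of parentheses encloses $p$ terms of the form

$$\frac{1}{p + 1} + \frac{1}{p + 2} + \cdots + \frac{1}{p + p} > \frac{p}{2p} = \frac{1}{2}. \tag{5.12}$$

Forming partial sums by adding the parenthetical groups one by one, we obtain

$$\begin{aligned} s_1 &= 1, & s_4 &> \tfrac{5}{2}, \\ s_2 &= \tfrac{3}{2}, & s_5 &> \tfrac{6}{2}, \\ s_3 &> \tfrac{4}{2}, & s_n &> \tfrac{n + 1}{2}. \end{aligned} \tag{5.13}$$
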


The harmonic series considered in this way is certainly divergent.² An
alternate and independent demonstration of its divergence appears in Section
5.2.
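
The grouping argument can be verified with exact rational arithmetic. In the sketch below the $n$th grouped partial sum of Eq. 5.11 is taken to contain the first $2^{n-1}$ terms of the series (one term, two terms, four terms, and so on); `harmonic` is an illustrative helper name.

```python
from fractions import Fraction

def harmonic(n):
    """Exact sum 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# The n-th grouped partial sum of Eq. 5.11 holds the first 2^(n-1) terms.
for n in range(1, 8):
    s = harmonic(2 ** (n - 1))
    print(n, float(s), "lower bound", (n + 1) / 2)
```

Each grouped sum sits at or above $(n + 1)/2$, so the sums exceed any chosen number once enough groups are included, although the growth is very slow.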

Using the binomial theorem³ (Section 5.6), we may expand the function
$(1 + x)^{-1}$:

$$\frac{1}{1 + x} = 1 - x + x^2 - x^3 + \cdots + (-x)^{n-1} + \cdots. \tag{5.14}$$

If we let $x \to 1$, this series becomes

$$1 - 1 + 1 - 1 + 1 - 1 + \cdots, \tag{5.15}$$

a series that we labeled oscillatory earlier in this section. Although it does not
converge in the usual sense, meaning can be attached to this series. Euler, for
example, assigned a value of $\tfrac{1}{2}$ to this oscillatory sequence on the basis of the
correspondence between this series and the well-defined function $(1 + x)^{-1}$.
Unfortunately, such correspondence between series and function is not unique
and this approach must be refined. Other methods of assigning a meaning to a
divergent or oscillatory series, methods of defining a sum, have been developed.
In general, however, this aspect of infinite series is of relatively little interest to
the scientist or the engineer. An exception to this statement, the very important
asymptotic or semiconvergent series, is considered in Section 5.10.

² The (finite) harmonic series appears in an interesting note on the maximum
stable displacement of a stack of coins, Johnson, P. R., "The Leaning Tower
of Lire," Am. J. Phys. 23, 240 (1955).

³ Actually Eq. 5.14 may be taken as an identity and verified by multiplying
both sides by $1 + x$.


EXERCISES 


5.1.1 Show that

$$\sum_{n=1}^{\infty} \frac{1}{(2n - 1)(2n + 1)} = \frac{1}{2}.$$

Hint. Show (by mathematical induction) that $s_m = m/(2m + 1)$.

5.1.2 Show that

$$\sum_{n=1}^{\infty} \frac{1}{n(n + 1)} = 1.$$

Find the partial sum $s_m$ and verify its correctness by mathematical induction.
Note. The method of expansion in partial fractions, Section 15.8, offers an alternative
way of solving Exercises 5.1.1 and 5.1.2.


5.2 CONVERGENCE TESTS 

Although nonconvergent series may be useful in certain special cases
(compare Section 5.10), we usually insist, as a matter of convenience if not
necessity, that our series be convergent. It therefore becomes a matter of extreme
importance to be able to tell whether a given series is convergent. We shall
develop a number of possible tests, starting with the simple and relatively insensitive
tests and working up to the more complicated but quite sensitive tests.

For the present let us consider a series of positive terms, $a_n > 0$, postponing
negative terms until the next section.

Comparison Test 

If term by term a series of terms $u_n \leq a_n$, in which the $a_n$ form a convergent
series, the series $\sum_n u_n$ is also convergent. Symbolically, we have

$$\sum_n a_n = a_1 + a_2 + a_3 + \cdots, \quad \text{convergent},$$

$$\sum_n u_n = u_1 + u_2 + u_3 + \cdots.$$

If $u_n \leq a_n$ for all $n$, then $\sum_n u_n \leq \sum_n a_n$ and $\sum_n u_n$ therefore is convergent.

If term by term a series of terms $v_n \geq b_n$, in which the $b_n$ form a divergent
series, the series $\sum_n v_n$ is also divergent. Note that comparisons of $u_n$ with $b_n$ or
$v_n$ with $a_n$ yield no information. Here we have

$$\sum_n b_n = b_1 + b_2 + b_3 + \cdots, \quad \text{divergent},$$

$$\sum_n v_n = v_1 + v_2 + v_3 + \cdots.$$

If $v_n \geq b_n$ for all $n$, then $\sum_n v_n \geq \sum_n b_n$ and $\sum_n v_n$ therefore is divergent.

For the convergent series $a_n$ we already have the geometric series, whereas
the harmonic series will serve as the divergent series $b_n$. As other series are
identified as either convergent or divergent, they may be used for the known
series in this comparison test.

All tests developed in this section are essentially comparison tests. Figure 5.1 
exhibits these tests and the interrelationships. 



FIG. 5.1 Comparison tests

EXAMPLE 5.2.1 The p Series

Test $\sum_n n^{-p}$, $p = 0.999$, for convergence. Since $n^{-0.999} > n^{-1}$, and $b_n = n^{-1}$
forms the divergent harmonic series, the comparison test shows that $\sum_n n^{-0.999}$
is divergent. Generalizing, $\sum_n n^{-p}$ is seen to be divergent for all $p \leq 1$.

Cauchy Root Test 

If $(a_n)^{1/n} \leq r < 1$ for all sufficiently large $n$, with $r$ independent of $n$, then
$\sum_n a_n$ is convergent. If $(a_n)^{1/n} \geq 1$ for all sufficiently large $n$, then $\sum_n a_n$ is divergent.

The first part of this test is verified easily by raising $(a_n)^{1/n} \leq r$ to the $n$th
power. We get

$$a_n \leq r^n < 1.$$

Since $r^n$ is just the $n$th term in a convergent geometric series, $\sum_n a_n$ is convergent
by the comparison test. Conversely, if $(a_n)^{1/n} \geq 1$, then $a_n \geq 1$ and the series
must diverge. This root test is particularly useful in establishing the properties
of power series (Section 5.7).


D'Alembert or Cauchy Ratio Test 

If $a_{n+1}/a_n \leq r < 1$ for all sufficiently large $n$, and $r$ is independent of $n$, then
$\sum_n a_n$ is convergent. If $a_{n+1}/a_n \geq 1$ for all sufficiently large $n$, then $\sum_n a_n$ is
divergent.

Convergence is proved by direct comparison with the geometric series
$(1 + r + r^2 + \cdots)$. In the second part $a_{n+1} \geq a_n$ and divergence should be
reasonably obvious. Although not quite so sensitive as the Cauchy root test,
this D'Alembert ratio test is one of the easiest to apply and is widely used. An
alternate statement of the ratio test is in the form of a limit: If

$$\lim_{n\to\infty} \frac{a_{n+1}}{a_n} \begin{cases} < 1, & \text{convergence}, \\ > 1, & \text{divergence}, \\ = 1, & \text{indeterminate}. \end{cases} \tag{5.16}$$

Because of this final indeterminate possibility, the ratio test is likely to fail
at crucial points, and more delicate, more sensitive tests are necessary.

The alert reader may wonder how this indeterminacy arose. Actually it was
concealed in the first statement $a_{n+1}/a_n \leq r < 1$. We might encounter $a_{n+1}/a_n < 1$
for all finite $n$ but be unable to choose an $r < 1$ and independent of $n$ such that
$a_{n+1}/a_n \leq r$ for all sufficiently large $n$. An example is provided by the harmonic
series

$$\frac{a_{n+1}}{a_n} = \frac{n}{n + 1} < 1. \tag{5.17}$$

Since

$$\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = 1, \tag{5.18}$$

no fixed ratio $r < 1$ exists and the ratio test fails.

EXAMPLE 5.2.2 D'Alembert Ratio Test

Test $\sum_n n/2^n$ for convergence.

$$\frac{a_{n+1}}{a_n} = \frac{(n + 1)/2^{n+1}}{n/2^n} = \frac{1}{2}\,\frac{n + 1}{n}. \tag{5.19}$$

Since

$$\frac{a_{n+1}}{a_n} \leq \frac{3}{4} \quad \text{for } n \geq 2, \tag{5.20}$$

we have convergence. Alternatively,

$$\lim_{n\to\infty} \frac{a_{n+1}}{a_n} = \frac{1}{2} \tag{5.21}$$

and again convergence.

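
The ratio bound of this example is easy to confirm numerically. In the sketch below `a` and `ratios` are illustrative names; the last line sums enough terms that the remaining tail is far below double-precision resolution.

```python
def a(n):
    """Terms of the series in Example 5.2.2."""
    return n / 2 ** n

# Ratios a_{n+1}/a_n for n = 1, 2, ..., 29.
ratios = [a(n + 1) / a(n) for n in range(1, 30)]
print(ratios[0], ratios[1], ratios[-1])

# Partial sum with 199 terms.
print(sum(a(n) for n in range(1, 200)))
```

The ratio equals 1 at $n = 1$, drops to 3/4 at $n = 2$, and then decreases toward 1/2, which is why the bound in Eq. 5.20 is stated for $n \geq 2$. The partial sums level off at 2.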

Cauchy or Maclaurin Integral Test

This is another sort of comparison test in which we compare a series with an
integral. Geometrically, we compare the area of a series of unit-width rectangles
with the area under a curve.

FIG. 5.2 (a) Comparison of integral and sum-blocks leading, (b) Comparison of
integral and sum-blocks lagging

Let $f(x)$ be a continuous, monotonic decreasing function in which $f(n) = a_n$.
Then $\sum_n a_n$ converges if $\int_1^{\infty} f(x)\,dx$ is finite and diverges if the integral is infinite.
For the $i$th partial sum

$$s_i = \sum_{n=1}^{i} a_n = \sum_{n=1}^{i} f(n). \tag{5.22}$$

But

$$s_i > \int_1^{i+1} f(x)\,dx \tag{5.23}$$

by Fig. 5.2a, $f(x)$ being monotonic decreasing. On the other hand, from Fig. 5.2b,

$$s_i - a_1 < \int_1^{i} f(x)\,dx, \tag{5.24}$$

in which the series is represented by the inscribed rectangles. Taking the limit
as $i \to \infty$, we have

$$\int_1^{\infty} f(x)\,dx < \sum_{n=1}^{\infty} a_n < \int_1^{\infty} f(x)\,dx + a_1. \tag{5.25}$$

Hence the infinite series converges or diverges as the corresponding integral
converges or diverges.

This integral test is particularly useful in setting upper and lower bounds on
the remainder of a series after some number of initial terms have been summed.
That is,

$$\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{N} a_n + \sum_{n=N+1}^{\infty} a_n,$$

where

$$\int_{N+1}^{\infty} f(x)\,dx < \sum_{n=N+1}^{\infty} a_n < \int_{N+1}^{\infty} f(x)\,dx + a_{N+1}.$$


EXAMPLE 5.2.3 Riemann Zeta Function

The Riemann zeta function is defined by

$$\zeta(p) = \sum_{n=1}^{\infty} n^{-p}. \tag{5.26}$$

We may take $f(x) = x^{-p}$ and then

$$\int_1^{\infty} x^{-p}\,dx = \begin{cases} \left.\dfrac{x^{-p+1}}{-p + 1}\right|_1^{\infty}, & p \neq 1, \\[2ex] \left.\ln x\,\right|_1^{\infty}, & p = 1. \end{cases} \tag{5.27}$$

The integral and therefore the series are divergent for $p \leq 1$, convergent for
$p > 1$. Hence Eq. 5.26 should carry the condition $p > 1$. This, incidentally,
is an independent proof that the harmonic series ($p = 1$) diverges and diverges
logarithmically. The sum of the first million terms, $\sum_{n=1}^{1{,}000{,}000} n^{-1}$, is only
14.392 726....

This integral comparison may also be used to set an upper limit to the
Euler-Mascheroni constant¹ defined by

$$\gamma = \lim_{n\to\infty} \left( \sum_{m=1}^{n} m^{-1} - \ln n \right). \tag{5.28}$$

Returning to partial sums,

$$s_n = \sum_{m=1}^{n} m^{-1} - \ln n < \int_1^{n} \frac{dx}{x} - \ln n + 1. \tag{5.29}$$

Evaluating the integral on the right, $s_n < 1$ for all $n$ and therefore $\gamma < 1$. Exercise
5.2.12 leads to more restrictive bounds. Actually the Euler-Mascheroni constant
is 0.577 215 66....
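
The slow logarithmic growth of the harmonic sums and the approach of $s_n$ to the Euler-Mascheroni constant can both be observed directly. The sketch below uses `math.fsum` for accurate floating-point summation; `harmonic_sum` and `s` are illustrative names.

```python
import math

def harmonic_sum(n):
    """Sum of 1/m for m = 1..n, accumulated with math.fsum."""
    return math.fsum(1.0 / m for m in range(1, n + 1))

def s(n):
    """s_n = sum_{m=1}^{n} 1/m - ln n, the sequence in Eq. 5.28."""
    return harmonic_sum(n) - math.log(n)

for n in (2, 10, 1000, 100000):
    print(n, s(n))

print(harmonic_sum(10 ** 6))   # sum of the first million terms
```

The values of `s(n)` decrease toward the quoted constant, and the million-term sum can be compared with the figure given in Example 5.2.3.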


¹ This is the notation of National Bureau of Standards, Handbook of Mathematical
Functions, Applied Mathematics Series-55 (AMS-55).
Kummer's Test

This is the first of three tests that are somewhat more difficult to apply than
the preceding tests. Their importance lies in their power and sensitivity. Frequently,
at least one of the three will work when the simpler, easier tests are
indecisive. It must be remembered, however, that these tests, like those previously
discussed, are ultimately based on comparisons. It can be shown that
there is no most slowly converging convergent series and no most slowly
diverging divergent series. This means that all convergence tests given here,
including Kummer's, may fail sometime.

We consider a series of positive terms $u_i$ and a sequence of finite positive
constants $a_i$. If

$$a_n \frac{u_n}{u_{n+1}} - a_{n+1} \geq C > 0 \tag{5.30}$$

for all $n \geq N$, some fixed number,² then $\sum_i u_i$ converges. If

$$a_n \frac{u_n}{u_{n+1}} - a_{n+1} \leq 0 \tag{5.31}$$

and $\sum_i a_i^{-1}$ diverges, then $\sum_i u_i$ diverges.

The proof of this powerful test is remarkably simple. From Eq. 5.30, with
$C$ some positive constant,

$$\begin{aligned} C u_{N+1} &\leq a_N u_N - a_{N+1} u_{N+1} \\ C u_{N+2} &\leq a_{N+1} u_{N+1} - a_{N+2} u_{N+2} \\ &\ \,\vdots \\ C u_n &\leq a_{n-1} u_{n-1} - a_n u_n. \end{aligned} \tag{5.32}$$

Adding and dividing by $C$ ($C \neq 0$), we obtain

$$\sum_{i=N+1}^{n} u_i \leq \frac{a_N u_N}{C} - \frac{a_n u_n}{C}. \tag{5.33}$$

Hence for the partial sum, $s_n$,

$$s_n \leq \sum_{i=1}^{N} u_i + \frac{a_N u_N}{C} - \frac{a_n u_n}{C} < \sum_{i=1}^{N} u_i + \frac{a_N u_N}{C}, \quad \text{a constant, independent of } n. \tag{5.34}$$

The partial sums therefore have an upper bound. With zero as an obvious
lower bound, the series $\sum_i u_i$ must converge.

Divergence is shown as follows. From Eq. 5.31

$$a_n u_n \geq a_{n-1} u_{n-1} \geq \cdots \geq a_N u_N, \quad n > N. \tag{5.35}$$

Thus

$$u_n \geq \frac{a_N u_N}{a_n} \tag{5.36}$$

and

$$\sum_{i=N+1}^{\infty} u_i \geq a_N u_N \sum_{i=N+1}^{\infty} a_i^{-1}. \tag{5.37}$$

If $\sum_i a_i^{-1}$ diverges, then by the comparison test $\sum_i u_i$ diverges.

² With $u_m$ finite, the partial sum $s_N$ will always be finite for $N$ finite. The
convergence or divergence of a series depends on the behavior of the last
infinity of terms, not on the first $N$ terms.

Equations 5.30 and 5.31 are often given in a limit form:

$$\lim_{n\to\infty} \left( a_n \frac{u_n}{u_{n+1}} - a_{n+1} \right) = C. \tag{5.38}$$

Thus for $C > 0$ we have convergence, whereas for $C < 0$ (and $\sum_i a_i^{-1}$ divergent)
we have divergence. It is perhaps useful to show the equivalence of Eq. 5.38 and
Eqs. 5.30 and 5.31 and to show why indeterminacy creeps in when the limit
$C = 0$. From the definition of limit

$$\left| a_n \frac{u_n}{u_{n+1}} - a_{n+1} - C \right| < \varepsilon \tag{5.39}$$

for all $n \geq N$ and all $\varepsilon > 0$, no matter how small $\varepsilon$ may be. When the absolute
value signs are removed,

$$C - \varepsilon < a_n \frac{u_n}{u_{n+1}} - a_{n+1} < C + \varepsilon. \tag{5.40}$$

Now if $C > 0$, Eq. 5.30 follows from $\varepsilon$ sufficiently small. On the other hand,
if $C < 0$, Eq. 5.31 follows. However, if $C = 0$, the center term $a_n(u_n/u_{n+1}) - a_{n+1}$
may be either positive or negative and the proof fails. The primary use of
Kummer's test is to prove other tests such as Raabe's (compare also Exercise
5.2.3).

If the positive constants $a_n$ of Kummer's test are chosen $a_n = n$, we have
Raabe's test.


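Raabe's Test

If $u_n > 0$ and if

$$n\left(\frac{u_n}{u_{n+1}} - 1\right) \geq P > 1 \tag{5.41}$$

for all $n \geq N$, where $N$ is a positive integer independent of $n$, then $\sum_i u_i$ converges.
If

$$n\left(\frac{u_n}{u_{n+1}} - 1\right) \leq 1, \tag{5.42}$$

then $\sum_i u_i$ diverges ($\sum_n n^{-1}$ diverges).
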
(5.42) 
The limit form of Raabe's test is

$$\lim_{n\to\infty} n\left(\frac{u_n}{u_{n+1}} - 1\right) = P. \tag{5.43}$$

We have convergence for $P > 1$, divergence for $P < 1$, and no test for $P = 1$,
exactly as with the Kummer test. This indeterminacy is pointed up by Exercise
5.2.4, which presents a convergent series and a divergent series with both series
yielding $P = 1$ in Eq. 5.43.
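
A quick numerical look at the quantity in Eq. 5.43 shows how Raabe's test separates two series that the ratio test cannot distinguish. The sample series $n^{-2}$ and $n^{-1}$ and the function names below are illustrative choices.

```python
def raabe_quantity(u, n):
    """n * (u_n / u_{n+1} - 1), whose limit is P in Eq. 5.43."""
    return n * (u(n) / u(n + 1) - 1.0)

def inverse_square(n):
    return 1.0 / n ** 2

def harmonic_term(n):
    return 1.0 / n

for n in (10, 1000, 100000):
    print(n, raabe_quantity(inverse_square, n), raabe_quantity(harmonic_term, n))
```

For $n^{-2}$ the quantity is $2 + 1/n$, so $P = 2$ and the series converges. For the harmonic series the quantity equals 1 at every $n$, so the limit form gives no decision, although the inequality form, Eq. 5.42, still applies.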

Raabe's test is more sensitive than the d'Alembert ratio test because $\sum_{n=1}^{\infty} n^{-1}$
diverges more slowly than $\sum_{n=1}^{\infty} 1$. We obtain a still more sensitive test (and one
that is relatively easy to apply) by choosing $a_n = n \ln n$. This is Gauss's test.


Gauss's Test

If $u_n > 0$ for all finite $n$ and

$$\frac{u_n}{u_{n+1}} = 1 + \frac{h}{n} + \frac{B(n)}{n^2}, \tag{5.44a}$$

in which $B(n)$ is a bounded function of $n$ for $n \to \infty$, then $\sum_i u_i$ converges for
$h > 1$ and diverges for $h \leq 1$.

The ratio $u_n/u_{n+1}$ of Eq. 5.44a often comes as the ratio of two quadratic forms:

$$\frac{u_n}{u_{n+1}} = \frac{n^2 + a_1 n + a_0}{n^2 + b_1 n + b_0}. \tag{5.44b}$$

It may be shown (Exercise 5.2.5) that we have convergence for $a_1 > b_1 + 1$ and
divergence for $a_1 \leq b_1 + 1$.

The Gauss test is an extremely sensitive test of series convergence. It will work for all series the physicist is likely to encounter. For $h > 1$ or $h < 1$ the proof follows directly from Raabe's test

$$\lim_{n\to\infty} n\left[1 + \frac{h}{n} + \frac{B(n)}{n^2} - 1\right] = \lim_{n\to\infty}\left[h + \frac{B(n)}{n}\right] = h. \tag{5.45}$$


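If $h = 1$, Raabe's test fails. However, if we return to Kummer's test and use $a_n = n \ln n$, Eq. 5.38 leads to

$$\lim_{n\to\infty}\left\{ n\ln n\left[1 + \frac{1}{n} + \frac{B(n)}{n^2}\right] - (n+1)\ln(n+1)\right\}
= \lim_{n\to\infty}\left[ n\ln n\cdot\frac{n+1}{n} - (n+1)\ln(n+1)\right]
= \lim_{n\to\infty}(n+1)\left[\ln n - \ln n - \ln\left(1 + \frac{1}{n}\right)\right]. \tag{5.46}$$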


Borrowing a result from Section 5.6 (which is not dependent on Gauss's test), we have

$$\lim_{n\to\infty} -(n+1)\ln\left(1 + \frac{1}{n}\right) = \lim_{n\to\infty} -(n+1)\left(\frac{1}{n} - \frac{1}{2n^2} + \frac{1}{3n^3} - \cdots\right) = -1 < 0. \tag{5.47}$$

Hence we have divergence for $h = 1$. This is an example of a successful application of Kummer's test in which Raabe's test had failed.


EXAMPLE 5.2.4 Legendre Series

The recurrence relation for the series solution of Legendre's equation (Section 8.5) may be put in the form

$$\frac{a_{2j+2}}{a_{2j}} = \frac{2j(2j+1) - l(l+1)}{(2j+1)(2j+2)}. \tag{5.48}$$

This is equivalent to $u_{2j+2}/u_{2j}$ for $x = +1$. For $j \gg l$,³

$$\frac{u_{2j}}{u_{2j+2}} \to \frac{(2j+1)(2j+2)}{2j(2j+1)} = \frac{2j+2}{2j} = 1 + \frac{1}{j}. \tag{5.49}$$

By Eq. 5.44b the series is divergent. Later we shall demand that the Legendre series be finite at $x = 1$. We shall eliminate the divergence by setting the parameter $l = 2j_0$, an even integer. This will truncate the series, converting the infinite series into a polynomial.


Improvement of Convergence 

This section so far has been concerned with establishing convergence as an 
abstract mathematical property. In practice, the rate of convergence may be 
of considerable importance. Here we present one method of improving the rate 
of convergence of a convergent series. Other techniques are given in Sections 
5.4 and 5.9. 

The basic principle of this method, due to Kummer, is to form a linear combination of our slowly converging series and one or more series whose sum is known. For the known series the collection

$$\alpha_1 = \sum_{n=1}^{\infty} \frac{1}{n(n+1)} = 1$$

$$\alpha_2 = \sum_{n=1}^{\infty} \frac{1}{n(n+1)(n+2)} = \frac{1}{4}$$

$$\alpha_3 = \sum_{n=1}^{\infty} \frac{1}{n(n+1)(n+2)(n+3)} = \frac{1}{18}$$

$$\vdots$$

$$\alpha_p = \sum_{n=1}^{\infty} \frac{1}{n(n+1)\cdots(n+p)} = \frac{1}{p \cdot p!}$$


3 The $l$ dependence enters $B(j)$ but does not affect $h$.
is particularly useful.⁴ The series are combined term by term and the coefficients in the linear combination chosen to cancel the most slowly converging terms.
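The sums $\alpha_p = 1/(p \cdot p!)$ can be checked exactly with rational arithmetic. The sketch below is an illustration only: the helper names and the closed form used for the tail, $\frac{1}{p}\,\frac{1}{(N+1)(N+2)\cdots(N+p)}$, are not from the text. That tail follows from the telescoping partial-fraction identity $\frac{1}{n(n+1)\cdots(n+p)} = \frac{1}{p}\left[\frac{1}{n\cdots(n+p-1)} - \frac{1}{(n+1)\cdots(n+p)}\right]$.

```python
from fractions import Fraction
from math import factorial, prod


def alpha_partial(p, N):
    """Exact partial sum of 1/(n(n+1)...(n+p)) for n = 1..N."""
    return sum(Fraction(1, prod(range(n, n + p + 1))) for n in range(1, N + 1))


def alpha_limit(p):
    """The quoted value of the infinite sum, 1/(p * p!)."""
    return Fraction(1, p * factorial(p))


def alpha_tail(p, N):
    """Remainder after N terms, obtained from the telescoping identity."""
    return Fraction(1, p * prod(range(N + 1, N + p + 1)))


for p in (1, 2, 3, 4):
    N = 50
    # partial sum + tail reproduces the limit exactly
    assert alpha_partial(p, N) + alpha_tail(p, N) == alpha_limit(p)
    print(p, alpha_limit(p), float(alpha_partial(p, N)))
```

Because the tail falls off as $N^{-p}$, the partial sums approach the limit faster as $p$ increases.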


EXAMPLE 5.2.5 Riemann Zeta Function, $\zeta(3)$

Let the series to be summed be $\sum_{n=1}^{\infty} n^{-3}$. In Section 5.9 this is identified as a Riemann zeta function, $\zeta(3)$. We form a linear combination

$$\sum_{n=1}^{\infty} n^{-3} + a_2\alpha_2 = \sum_{n=1}^{\infty} n^{-3} + \frac{a_2}{4}.$$

$\alpha_1$ is not included since it converges more slowly than $\zeta(3)$. Combining terms, we obtain on the left-hand side

$$\sum_{n=1}^{\infty}\left\{\frac{1}{n^3} + \frac{a_2}{n(n+1)(n+2)}\right\} = \sum_{n=1}^{\infty}\frac{n^2(1 + a_2) + 3n + 2}{n^3(n+1)(n+2)}.$$

If we choose $a_2 = -1$, the preceding equations yield

$$\zeta(3) = \sum_{n=1}^{\infty} n^{-3} = \frac{1}{4} + \sum_{n=1}^{\infty}\frac{3n+2}{n^3(n+1)(n+2)}.$$

The resulting series may not be beautiful but it does converge as $n^{-4}$, appreciably faster than $n^{-3}$. A more convenient form comes from Exercise 5.2.21. There, the symmetry leads to convergence as $n^{-5}$.

The method can be extended including $a_3\alpha_3$ to get convergence as $n^{-5}$, $a_4\alpha_4$ to get convergence as $n^{-6}$, and so on. Eventually, you have to reach a compromise between how much algebra you do and how much arithmetic the computing machine does. As computing machines get larger and faster, the balance is steadily shifting to less algebra for you and more arithmetic for the machine.
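To see the gain in practice, the following sketch compares the direct sum $\sum n^{-3}$ with the accelerated form $\frac14 + \sum (3n+2)/[n^3(n+1)(n+2)]$ of this example, both truncated after $N$ terms. The reference value 1.202056903 is the one quoted in the comment to Exercise 5.2.10; the function names are illustrative.

```python
ZETA3_REFERENCE = 1.202056903  # value quoted in Exercise 5.2.10


def zeta3_direct(N):
    """Sum of n**-3 for n = 1..N (terms fall off as n**-3)."""
    return sum(float(n) ** -3 for n in range(1, N + 1))


def zeta3_accelerated(N):
    """1/4 + sum of (3n+2)/(n^3 (n+1)(n+2)) for n = 1..N (terms fall off as n**-4)."""
    tail = sum((3 * n + 2) / (n ** 3 * (n + 1) * (n + 2)) for n in range(1, N + 1))
    return 0.25 + tail


for N in (10, 100, 1000):
    err_direct = abs(zeta3_direct(N) - ZETA3_REFERENCE)
    err_accel = abs(zeta3_accelerated(N) - ZETA3_REFERENCE)
    print(f"N = {N:5d}  direct error = {err_direct:.2e}  accelerated error = {err_accel:.2e}")
```

With 100 terms the direct sum is in error by roughly $1/(2N^2) \approx 5 \times 10^{-5}$, whereas the accelerated sum is in error by roughly $1/N^3 \approx 10^{-6}$.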


EXERCISES 

5.2.1 (a) Prove that if

$$\lim_{n\to\infty} n^p u_n = A < \infty, \quad p > 1,$$

the series $\sum_{n=1}^{\infty} u_n$ converges.

(b) Prove that if

$$\lim_{n\to\infty} n u_n = A > 0,$$

the series diverges. (The test fails for $A = 0$.)

These two tests, known as limit tests, are often convenient for establishing the convergence or divergence of a series. They may be treated as comparison tests, comparing with

$$\sum_n n^{-q}, \quad 1 \le q < p.$$

4 These series sums may be verified by expanding the forms by partial fractions, writing out the initial terms and inspecting the pattern of cancellation of positive and negative terms.


5.2.2 If

$$\lim_{n\to\infty} \frac{b_n}{a_n} = K,$$

a constant with $0 < K < \infty$, show that $\sum_n b_n$ converges or diverges with $\sum_n a_n$.
Hint. If $\sum a_n$ converges, use

$$b_n' = \frac{1}{2K}\, b_n.$$

If $\sum a_n$ diverges, use

$$b_n'' = \frac{2}{K}\, b_n.$$

5.2.3 Show that the complete d'Alembert ratio test follows directly from Kummer's test with $a_i = 1$.

5.2.4 Show that Raabe's test is indecisive for $P = 1$ by establishing that $P = 1$ for the series

(a) $u_n = \dfrac{1}{n \ln n}$ and that this series diverges.

(b) $u_n = \dfrac{1}{n(\ln n)^2}$ and that this series converges.

Note. By direct addition $\sum_{n=2}^{100{,}000} [n(\ln n)^2]^{-1} = 2.02288$. The remainder of the series $n > 10^5$ yields 0.08686 by the integral comparison test. The total, then, 2 to $\infty$, is 2.1097.

5.2.5 Gauss's test is often given in the form of a test of the ratio

$$\frac{u_n}{u_{n+1}} = \frac{n^2 + a_1 n + a_0}{n^2 + b_1 n + b_0}.$$

For what values of the parameters $a_1$ and $b_1$ is there convergence? Divergence?

ANS. Convergent for $a_1 - b_1 > 1$,
divergent for $a_1 - b_1 \le 1$.


Test for convergence 



(a) £ On«r‘- 

n — 2 

(d) 

X [n(« + 1)]- /2 

n = l 

(b) y — . 

(e) 

f 1 

„=o 2 n + 1 

OO 1 

(c) I , n n - 

„=1 2n(2n - 1) 


Test for convergence 



OO 1 

(a) X 

,-i "(« + 1) 

(d) 


0° 1 

(b) X-r- 

„= 2 n In n 

(e) 

00 1 

y 

00 1 

(c) xy 

„=i «2" 
5.2.8 For what values of $p$ and $q$ will the following series converge?

$$\sum_{n=2}^{\infty} \frac{1}{n^p (\ln n)^q}$$

ANS. Convergent for $p > 1$, all $q$; and for $p = 1$, $q > 1$.
Divergent for $p < 1$, all $q$; and for $p = 1$, $q \le 1$.


5.2.9 Determine the range of convergence for Gauss's hypergeometric series

$$F(\alpha, \beta, \gamma; x) = 1 + \frac{\alpha\beta}{1!\,\gamma}\,x + \frac{\alpha(\alpha+1)\beta(\beta+1)}{2!\,\gamma(\gamma+1)}\,x^2 + \cdots.$$

Hint. Gauss developed Gauss's test for the specific purpose of establishing the convergence of this series.

ANS. Convergent for $-1 < x < 1$ and $x = \pm 1$ if $\gamma > \alpha + \beta$.


5.2.10 A simple machine calculation yields

$$\sum_{n=1}^{100} n^{-3} = 1.202007.$$

Show that

$$1.202056 \le \sum_{n=1}^{\infty} n^{-3} \le 1.202057.$$

Hint. Use integrals to set upper and lower bounds on $\sum_{n=101}^{\infty} n^{-3}$.

Comment. A more exact value for the summation $\sum_{n=1}^{\infty} n^{-3}$ is 1.202056903 ....

5.2.11 Set upper and lower bounds on $\sum_{n=1}^{1{,}000{,}000} n^{-1}$, assuming that
(a) the Euler-Mascheroni constant is known.

ANS. $14.392\,726 < \sum_{n=1}^{1{,}000{,}000} n^{-1} < 14.392\,727$.

(b) The Euler-Mascheroni constant is unknown.

5.2.12 Given $\sum_{n=1}^{1{,}000} n^{-1} = 7.485470\ldots$, set upper and lower bounds on the Euler-Mascheroni constant.

ANS. $0.5767 < \gamma < 0.5778$.

5.2.13 (From Olbers's paradox.) Assume a static universe in which the stars are uniformly distributed. Divide all space into shells of constant thickness; the stars in any one shell by themselves subtend a solid angle of $\omega_0$. Allowing for the blocking out of distant stars by nearer stars, show that the total net solid angle subtended by all stars, shells extending to infinity, is exactly $4\pi$. (Therefore the night sky should be ablaze with light.)


5.2.14 Test for convergence

$$\sum_{n=1}^{\infty}\left[\frac{1\cdot 3\cdot 5\cdots(2n-1)}{2\cdot 4\cdot 6\cdots(2n)}\right]^2 = \frac{1}{4} + \frac{9}{64} + \frac{25}{256} + \cdots.$$

5.2.15 The Legendre series, $\sum_{j\,\text{even}} u_j(x)$, satisfies the recurrence relations

$$u_{j+2}(x) = \frac{(j+1)(j+2) - l(l+1)}{(j+2)(j+3)}\,x^2 u_j(x),$$

in which the index $j$ is even and $l$ is some constant (but, in this problem, not a nonnegative odd integer). Find the range of values of $x$ for which this Legendre series is convergent. Test the end points carefully.

ANS. $-1 < x < 1$.
5.2.16 A series solution (Section 8.5) of the Chebyshev equation leads to successive terms having the ratio

$$\frac{u_{j+2}(x)}{u_j(x)} = \frac{(k+j)^2 - n^2}{(k+j+1)(k+j+2)}\,x^2$$

with $k = 0$ and $k = 1$. Test for convergence at $x = \pm 1$.

ANS. Convergent.

5.2.17 A series solution for the ultraspherical (Gegenbauer) function $C_n^{\alpha}(x)$ leads to the recurrence

$$a_{j+2} = a_j\,\frac{(k+j)(k+j+2\alpha) - n(n+2\alpha)}{(k+j+1)(k+j+2)}.$$

Investigate the convergence of each of these series at $x = \pm 1$ as a function of the parameter $\alpha$.

ANS. Convergent for $\alpha < \tfrac12$,
divergent for $\alpha \ge \tfrac12$.


5.2.18 A series expansion of the incomplete beta function (Section 10.4) yields

$$B_x(p, q) = x^p\left\{\frac{1}{p} + \frac{1-q}{p+1}\,x + \frac{(1-q)(2-q)}{2!\,(p+2)}\,x^2 + \cdots + \frac{(1-q)(2-q)\cdots(n-q)}{n!\,(p+n)}\,x^n + \cdots\right\}.$$

Given that $0 \le x \le 1$, $p > 0$, and $q > 0$, test this series for convergence. What happens at $x = 1$?


5.2.19 Show that the following series is convergent.

$$\sum_{s=0}^{\infty} \frac{(2s-1)!!}{(2s)!!\,(2s+1)}$$

Note. $(2s-1)!! = (2s-1)(2s-3)\cdots 3\cdot 1$ with $(-1)!! = 1$. $(2s)!! = (2s)(2s-2)\cdots 4\cdot 2$ with $0!! = 1$. The series appears as a series expansion of $\sin^{-1}(1)$ and equals $\pi/2$.

5.2.20 Show how to combine $\zeta(2) = \sum_{n=1}^{\infty} n^{-2}$ with $\alpha_1$ and $\alpha_2$ to obtain a series converging as $n^{-4}$.

Note. $\zeta(2)$ is actually available in closed form: $\zeta(2) = \pi^2/6$ (see Section 5.9).


5.2.21 The convergence improvement of Example 5.2.5 may be carried out more expediently (in this special case) by putting $\alpha_2$ into a more symmetric form: Replacing $n$ by $n - 1$, we have

$$\alpha_2' = \sum_{n=2}^{\infty} \frac{1}{(n-1)\,n\,(n+1)} = \frac{1}{4}.$$

(a) Combine $\zeta(3)$ and $\alpha_2'$ to obtain convergence as $n^{-5}$.

(b) Let $\alpha_4'$ be $\alpha_4$ with $n \to n - 2$. Combine $\zeta(3)$, $\alpha_2'$, and $\alpha_4'$ to obtain convergence as $n^{-7}$.

(c) If $\zeta(3)$ is to be calculated to 6 decimal accuracy (error $5 \times 10^{-7}$), how many terms are required for $\zeta(3)$ alone? combined as in part (a)? combined as in part (b)?

Note. The error may be estimated using the corresponding integral.

ANS. (a) $\zeta(3) = \dfrac{5}{4} - \displaystyle\sum_{n=2}^{\infty} \frac{1}{n^3(n^2-1)}$.

5.2.22 Catalan's constant ($\beta(2)$ of AMS-55, Chapter 23) is defined by

$$\beta(2) = \sum_{k=0}^{\infty} (-1)^k (2k+1)^{-2} = \frac{1}{1^2} - \frac{1}{3^2} + \frac{1}{5^2} - \cdots.$$

Calculate $\beta(2)$ to six-digit accuracy.

Hint. The rate of convergence is enhanced by pairing the terms:

$$\frac{1}{(4k-1)^2} - \frac{1}{(4k+1)^2} = \frac{16k}{(16k^2-1)^2}.$$

If you have carried enough digits in your series summation, $\sum_{1 \le k \le N} 16k/(16k^2-1)^2$, additional significant figures may be obtained by setting upper and lower bounds on the tail of the series, $\sum_{k=N+1}^{\infty}$. These bounds may be set by comparison with integrals as in the Maclaurin integral test.

ANS. $\beta(2) = 0.9159\ 6559\ 4177\ldots$.


5.3 ALTERNATING SERIES 


In Section 5.2 we limited ourselves to series of positive terms. Now, in 
contrast, we consider infinite series in which the signs alternate. The partial 
cancellation due to alternating signs makes convergence more rapid and much 
easier to identify. We shall prove the Leibnitz criterion, a general condition 
for the convergence of an alternating series. 


Leibnitz Criterion 

Consider the series $\sum_{n=1}^{\infty} (-1)^{n+1} a_n$ with $a_n > 0$. If $a_n$ is monotonic decreasing (for sufficiently large $n$) and $\lim_{n\to\infty} a_n = 0$, then the series converges.

To prove this, we examine the even partial sums

$$s_{2n} = a_1 - a_2 + a_3 - \cdots - a_{2n},$$
$$s_{2n+2} = s_{2n} + (a_{2n+1} - a_{2n+2}). \tag{5.51}$$

Since $a_{2n+1} > a_{2n+2}$, we have

$$s_{2n+2} > s_{2n}. \tag{5.52}$$

On the other hand,

$$s_{2n+2} = a_1 - (a_2 - a_3) - (a_4 - a_5) - \cdots - a_{2n+2}. \tag{5.53}$$

Hence, with each pair of terms $a_{2p} - a_{2p+1} > 0$,

$$s_{2n+2} < a_1. \tag{5.54}$$

With the even partial sums bounded, $s_{2n} < s_{2n+2} < a_1$, and the terms $a_n$ decreasing monotonically and approaching zero, this alternating series converges.

One further important result can be extracted from the partial sums. From the difference between the series limit $S$ and the partial sum $s_n$

$$S - s_n = a_{n+1} - a_{n+2} + a_{n+3} - a_{n+4} + \cdots = a_{n+1} - (a_{n+2} - a_{n+3}) - (a_{n+4} - a_{n+5}) - \cdots$$

or

$$S - s_n < a_{n+1}. \tag{5.56}$$

Equation 5.56 says that the error in cutting off an alternating series after $n$ terms is less than $a_{n+1}$, the first term dropped. A knowledge of the error obtained this way may be of great practical importance.
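A quick numerical check of Eq. 5.56, using the alternating harmonic series whose limit $\ln 2$ is quoted in Exercise 5.3.2, might look like the sketch below. The function name `partial_sum` is illustrative, and the absolute value of the error is taken so that the comparison holds for both even and odd $n$.

```python
import math


def partial_sum(n):
    """Sum of the first n terms of 1 - 1/2 + 1/3 - 1/4 + ..."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))


for n in (10, 100, 1000):
    error = abs(math.log(2) - partial_sum(n))
    first_dropped = 1.0 / (n + 1)
    print(f"n = {n:4d}  |S - s_n| = {error:.6f}  a_(n+1) = {first_dropped:.6f}")
```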

Absolute Convergence 

Given a series of terms $u_n$ in which $u_n$ may vary in sign, if $\sum |u_n|$ converges, then $\sum u_n$ is said to be absolutely convergent. If $\sum u_n$ converges but $\sum |u_n|$ diverges, the convergence is called conditional.

The alternating harmonic series is a simple example of this conditional convergence. We have

$$\sum_{n=1}^{\infty} (-1)^{n-1} n^{-1} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots + \frac{(-1)^{n-1}}{n} + \cdots, \tag{5.57}$$

convergent by the Leibnitz criterion; but

$$\sum_{n=1}^{\infty} n^{-1} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{n} + \cdots \tag{5.58}$$

has been shown to be divergent in Sections 5.1 and 5.2.

The reader will note that all the tests developed in Section 5.2 assume a 
series of positive terms. Therefore all the tests in that section guarantee absolute 
convergence. 


EXERCISES 


5.3.1 (a) From the electrostatic two hemisphere problem (Exercise 12.3.20) we obtain the series

$$\sum_{s=0}^{\infty} (-1)^s (4s+3)\,\frac{(2s-1)!!}{(2s+2)!!}.$$

Test for convergence.

(b) The corresponding series for the surface charge density is

$$\sum_{s=0}^{\infty} (-1)^s (4s+3)\,\frac{(2s-1)!!}{(2s)!!}.$$

Test for convergence. The !! notation is explained in Section 10.1.

5.3.2 Show by direct numerical computation that the sum of the first 10 terms of

$$\lim_{x\to 1} \ln(1+x) = \ln 2 = \sum_{n=1}^{\infty} (-1)^{n-1} n^{-1}$$

differs from $\ln 2$ by less than the eleventh term: $\ln 2 = 0.69314\ 71806\ldots$.

5.3.3 In Exercise 5.2.9 the hypergeometric series is shown convergent for $x = \pm 1$, if $\gamma > \alpha + \beta$. Show that there is conditional convergence for $x = -1$ for $\gamma$ down to $\gamma > \alpha + \beta - 1$.

Hint. The asymptotic behavior of the factorial function is given by Stirling's series, Section 10.3.
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5.4 ALGEBRA OF SERIES 

The establishment of absolute convergence is important because it can be 
proved that absolutely convergent series may be handled according to the 
ordinary familiar rules of algebra or arithmetic. 

1. If an infinite series is absolutely convergent, the series 
sum is independent of the order in which the terms 
are added. 

2. The series may be multiplied with another absolutely 
convergent series. The limit of the product will be the 
product of the individual series limits. The product 
series, a double series, will also converge absolutely. 

No such guarantees can be given for conditionally convergent series. Again 
consider the alternating harmonic series. If we write 

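$$1 - \tfrac12 + \tfrac13 - \tfrac14 + \cdots = 1 - \left(\tfrac12 - \tfrac13\right) - \left(\tfrac14 - \tfrac15\right) - \cdots, \tag{5.59}$$

it is clear that the sum

$$\sum_{n=1}^{\infty} (-1)^{n-1} n^{-1} < 1. \tag{5.60}$$

However, if we rearrange the terms slightly, we may make the alternating harmonic series converge to $\tfrac32$. We regroup the terms of Eq. 5.59, taking

$$\left(1 + \tfrac13 + \tfrac15\right) - \left(\tfrac12\right) + \left(\tfrac17 + \tfrac19 + \tfrac1{11} + \tfrac1{13} + \tfrac1{15}\right) - \left(\tfrac14\right) + \left(\tfrac1{17} + \cdots + \tfrac1{25}\right) - \left(\tfrac16\right) + \left(\tfrac1{27} + \cdots + \tfrac1{35}\right) - \left(\tfrac18\right) + \cdots. \tag{5.61}$$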

Treating the terms grouped in parentheses as single terms for convenience, we obtain the partial sums

s_1 = 1.5333    s_2 = 1.0333
s_3 = 1.5218    s_4 = 1.2718
s_5 = 1.5143    s_6 = 1.3476
s_7 = 1.5103    s_8 = 1.3853
s_9 = 1.5078    s_10 = 1.4078

From this tabulation of $s_n$ and the plot of $s_n$ versus $n$ in Fig. 5.3 the convergence to $\tfrac32$ is fairly clear. We have rearranged the terms, taking positive terms until the partial sum was equal to or greater than $\tfrac32$, then adding in negative terms until the partial sum just fell below $\tfrac32$, and so on. As the series extends to infinity, all original terms will eventually appear, but the partial sums of this rearranged alternating harmonic series converge to $\tfrac32$.

By a suitable rearrangement of terms a conditionally convergent series may 
be made to converge to any desired value or even to diverge. This statement is 
sometimes given as Riemann’s theorem. Obviously, conditionally convergent 
series must be treated with caution. 
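The rearrangement rule described above (take positive terms until the partial sum is at least the target, then negative terms until it falls below) is easy to mechanize. The sketch below is one possible implementation; the function name and its arguments are illustrative, not from the text. For a target of 1.5 it reproduces the tabulated partial sums to within one unit in the fourth decimal place (the printed table appears to truncate rather than round).

```python
def rearranged_partial_sums(target, count):
    """Grouped partial sums of the alternating harmonic series rearranged toward `target`.

    Positive terms 1, 1/3, 1/5, ... are added until the running sum is >= target;
    negative terms -1/2, -1/4, ... are then added until it drops below target.
    One value is recorded at the end of each group.
    """
    sums = []
    total = 0.0
    odd, even = 1, 2  # denominators of the next unused positive / negative term
    while len(sums) < count:
        while total < target:
            total += 1.0 / odd
            odd += 2
        sums.append(total)
        if len(sums) == count:
            break
        while total >= target:
            total -= 1.0 / even
            even += 2
        sums.append(total)
    return sums


for i, s in enumerate(rearranged_partial_sums(1.5, 10), start=1):
    print(f"s_{i} = {s:.5f}")
```

Changing `target` to any other positive value illustrates Riemann's theorem: the same terms, taken in a different order, approach a different limit.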
FIG. 5.3 Alternating harmonic series: terms rearranged to give convergence to 1.5


Improvement of Convergence, Rational Approximations 

The series

$$\ln(1+x) = \sum_{n=1}^{\infty} (-1)^{n-1} x^n/n, \quad -1 < x \le 1, \tag{5.61a}$$

converges very slowly as $x$ approaches $+1$. The rate of convergence may be improved substantially by multiplying both sides of Eq. 5.61a by a polynomial and adjusting the polynomial coefficients to cancel the more slowly converging portions of the series. Consider the simplest possibility: Multiply $\ln(1+x)$ by $1 + a_1 x$.

$$(1 + a_1 x)\ln(1+x) = \sum_{n=1}^{\infty} (-1)^{n-1} x^n/n + a_1 \sum_{n=1}^{\infty} (-1)^{n-1} x^{n+1}/n.$$

Combining the two series on the right term by term, we obtain

$$(1 + a_1 x)\ln(1+x) = x + \sum_{n=2}^{\infty} (-1)^{n-1}\left(\frac{1}{n} - \frac{a_1}{n-1}\right)x^n = x + \sum_{n=2}^{\infty} (-1)^{n-1}\,\frac{n(1 - a_1) - 1}{n(n-1)}\,x^n.$$

Clearly, if we take $a_1 = 1$, the $n$ in the numerator disappears and our combined series converges as $n^{-2}$.

Continuing this process, we find that $(1 + 2x + x^2)\ln(1+x)$ vanishes as $n^{-3}$, $(1 + 3x + 3x^2 + x^3)\ln(1+x)$ vanishes as $n^{-4}$. In effect we are shifting from a simple series expansion of Eq. 5.61a to a rational fraction representation in which the function $\ln(1+x)$ is represented by the ratio of a series and a polynomial:

$$\ln(1+x) = \frac{x + \sum_{n=2}^{\infty} (-1)^n x^n/[n(n-1)]}{1 + x}.$$

Such rational approximations may be both compact and accurate. The SSP computer subroutines make extensive use of such approximations.
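The improvement can be seen by truncating both representations after the same power of $x$ and comparing with a library logarithm. The sketch below is a minimal illustration; `ln1p_series` and `ln1p_rational` are names introduced here, and the choice $x = 1$ is the slowest-converging point of the original series.

```python
import math


def ln1p_series(x, N):
    """Truncated Eq. 5.61a: sum of (-1)^(n-1) x^n / n for n = 1..N."""
    return sum((-1) ** (n - 1) * x ** n / n for n in range(1, N + 1))


def ln1p_rational(x, N):
    """[x + sum_{n=2}^{N} (-1)^n x^n / (n (n-1))] / (1 + x)."""
    numerator = x + sum((-1) ** n * x ** n / (n * (n - 1)) for n in range(2, N + 1))
    return numerator / (1.0 + x)


x = 1.0
for N in (10, 20, 40):
    exact = math.log(1.0 + x)
    print(f"N = {N:2d}  series error = {abs(ln1p_series(x, N) - exact):.2e}  "
          f"rational error = {abs(ln1p_rational(x, N) - exact):.2e}")
```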


Rearrangement of Double Series 

Another aspect of the rearrangement of series appears in the treatment of double series (Fig. 5.4):

$$\sum_{m=0}^{\infty}\sum_{n=0}^{\infty} a_{n,m}.$$

Let us substitute

$$n = q \ge 0, \qquad m = p - q \ge 0 \qquad (q \le p).$$

This results in the identity

$$\sum_{m=0}^{\infty}\sum_{n=0}^{\infty} a_{n,m} = \sum_{p=0}^{\infty}\sum_{q=0}^{p} a_{q,p-q}. \tag{5.62}$$

The summation over $p$ and $q$ of Eq. 5.62 is illustrated in Fig. 5.5. The substitution

$$n = s \ge 0, \qquad m = r - 2s \ge 0$$

leads to

$$\sum_{m=0}^{\infty}\sum_{n=0}^{\infty} a_{n,m} = \sum_{r=0}^{\infty}\sum_{s=0}^{[r/2]} a_{s,r-2s} \tag{5.63}$$

with $[r/2] = r/2$ for $r$ even, $(r-1)/2$ for $r$ odd. The summation over $r$ and $s$ of Eq. 5.63 is shown in Fig. 5.6. Equations 5.62 and 5.63 are clearly rearrangements of the array of coefficients $a_{nm}$, rearrangements that are valid as long as we have absolute convergence.

FIG. 5.4 Double series: summation over n indicated by vertical dashed lines

FIG. 5.5 Double series: again, the first summation is represented by vertical dashed lines but these vertical lines correspond to diagonals in Fig. 5.4.

FIG. 5.6 Double series. The summation over s corresponds to a summation along the almost horizontal slanted lines in Fig. 5.4.

The combination of Eqs. 5.62 and 5.63,

$$\sum_{p=0}^{\infty}\sum_{q=0}^{p} a_{q,p-q} = \sum_{r=0}^{\infty}\sum_{s=0}^{[r/2]} a_{s,r-2s} \tag{5.64}$$

is used in Section 12.1 in the determination of the series form of the Legendre polynomials.
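The three orderings can be compared numerically on any absolutely convergent array. The sketch below uses the test array $a_{n,m} = x^n y^m$ with $x = 0.3$, $y = 0.2$, a choice made here purely for illustration because its double sum is known in closed form, $1/[(1-x)(1-y)]$. The truncation limits are likewise arbitrary but large enough that the neglected terms are far below rounding error.

```python
X, Y = 0.3, 0.2  # illustrative values with |X|, |Y| < 1 (absolute convergence)


def a(n, m):
    """Test coefficients a_{n,m} = X**n * Y**m."""
    return X ** n * Y ** m


def sum_rectangular(limit):
    """Left side of Eqs. 5.62 and 5.63: sum over m, then n."""
    return sum(a(n, m) for m in range(limit) for n in range(limit))


def sum_diagonal(limit):
    """Right side of Eq. 5.62: sum over p, then q = 0..p of a_{q, p-q}."""
    return sum(a(q, p - q) for p in range(limit) for q in range(p + 1))


def sum_slanted(limit):
    """Right side of Eq. 5.63: sum over r, then s = 0..[r/2] of a_{s, r-2s}."""
    return sum(a(s, r - 2 * s) for r in range(limit) for s in range(r // 2 + 1))


closed_form = 1.0 / ((1.0 - X) * (1.0 - Y))
print(closed_form, sum_rectangular(80), sum_diagonal(80), sum_slanted(160))
```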


EXERCISES 


5.4.1 Given the series (derived in Section 5.6)

$$\ln(1+x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots, \quad -1 < x \le 1,$$

show that

(a) $\ln(1-x) = -x - \dfrac{x^2}{2} - \dfrac{x^3}{3} - \dfrac{x^4}{4} - \cdots, \quad -1 \le x < 1.$

(b) $\ln\left(\dfrac{1+x}{1-x}\right) = 2\left(x + \dfrac{x^3}{3} + \dfrac{x^5}{5} + \cdots\right), \quad -1 < x < 1.$

The original series, $\ln(1+x)$, appears in an analysis of binding energy in crystals. It is $\tfrac12$ the Madelung constant ($2 \ln 2$) for a chain of atoms. The second series (b) is useful in normalizing the Legendre polynomials (Section 12.3) and in developing a second solution for Legendre's differential equation (Section 12.10).

5.4.2 Determine the values of the coefficients $a_1$, $a_2$, and $a_3$ that will make $(1 + a_1 x + a_2 x^2 + a_3 x^3)\ln(1+x)$ converge as $n^{-4}$. Find the resulting series.

5.4.3 Show that

(a) $\displaystyle\sum_{n=2}^{\infty} [\zeta(n) - 1] = 1.$

(b) $\displaystyle\sum_{n=2}^{\infty} (-1)^n [\zeta(n) - 1] = \tfrac12,$

where $\zeta(n)$ is the Riemann zeta function.

5.4.4 Write a program that will rearrange the terms of the alternating harmonic series to make the series converge to 1.5. Group your terms as indicated in Eq. 5.61. List the first 100 successive partial sums that just climb above 1.5 or just drop below 1.5, and list the new terms included in each such partial sum.

ANS.
n    s_n
1    1.5333
2    1.0333
3    1.5218
4    1.2718
5    1.5143

5.5 SERIES OF FUNCTIONS 

We extend our concept of infinite series to include the possibility that each term $u_n$ may be a function of some variable, $u_n = u_n(x)$. Numerous illustrations of such series of functions appear in Chapters 11 to 14. The partial sums become functions of the variable $x$

$$s_n(x) = u_1(x) + u_2(x) + \cdots + u_n(x), \tag{5.65}$$

as does the series sum, defined as the limit of the partial sums

$$\sum_{n=1}^{\infty} u_n(x) = S(x) = \lim_{n\to\infty} s_n(x). \tag{5.66}$$

So far we have concerned ourselves with the behavior of the partial sums as a function of $n$. Now we consider how the foregoing quantities depend on $x$. The key concept here is that of uniform convergence.

Uniform Convergence

If for any small $\varepsilon > 0$ there exists a number $N$, independent of $x$ in the interval $[a, b]$ ($a \le x \le b$) such that

$$|S(x) - s_n(x)| < \varepsilon, \quad \text{for all } n \ge N, \tag{5.67}$$

the series is said to be uniformly convergent in the interval $[a, b]$. This says that for our series to be uniformly convergent, it must be possible to find a finite $N$ so that the tail of the infinite series, $\left|\sum_{i=N+1}^{\infty} u_i(x)\right|$, will be less than an arbitrarily small $\varepsilon$ for all $x$ in the given interval.

This condition, Eq. 5.67, which defines uniform convergence, is illustrated in Fig. 5.7. The point is that no matter how small $\varepsilon$ is taken to be we can always choose $n$ large enough so that the absolute magnitude of the difference between $S(x)$ and $s_n(x)$ is less than $\varepsilon$ for all $x$, $a \le x \le b$. If this cannot be done, then $\sum u_n(x)$ is not uniformly convergent in $[a, b]$.

EXAMPLE 5.5.1

$$\sum_{n=1}^{\infty} u_n(x) = \sum_{n=1}^{\infty} \frac{x}{[(n-1)x + 1][nx + 1]}. \tag{5.68}$$

The partial sum $s_n(x) = nx(nx+1)^{-1}$, as may be verified by mathematical induction. By inspection this expression for $s_n(x)$ holds for $n = 1, 2$. We assume it holds for $n$ terms and then prove it holds for $n + 1$ terms.

$$s_{n+1}(x) = s_n(x) + \frac{x}{[nx+1][(n+1)x+1]} = \frac{nx}{[nx+1]} + \frac{x}{[nx+1][(n+1)x+1]} = \frac{(n+1)x}{(n+1)x+1},$$

completing the proof.

Letting $n$ approach infinity, we obtain

$$S(0) = \lim_{n\to\infty} s_n(0) = 0,$$
$$S(x \ne 0) = \lim_{n\to\infty} s_n(x \ne 0) = 1.$$

We have a discontinuity in our series limit at $x = 0$. However, $s_n(x)$ is a continuous function of $x$, $0 \le x \le 1$, for all finite $n$. Equation 5.67, with $\varepsilon$ sufficiently small, will be violated for all finite $n$. Our series does not converge uniformly.
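The failure of uniform convergence in this example can be made concrete by evaluating the largest error $|S(x) - s_n(x)|$ over a set of sample points that crowd toward $x = 0$. The sketch below is illustrative only; the function names and the sample grid are choices made here. It also checks the closed form $s_n(x) = nx/(nx+1)$ against a direct summation of Eq. 5.68.

```python
def s_closed(n, x):
    """Closed-form partial sum s_n(x) = n x / (n x + 1)."""
    return n * x / (n * x + 1.0)


def s_direct(n, x):
    """Partial sum of Eq. 5.68 obtained by adding the first n terms."""
    return sum(x / (((k - 1) * x + 1.0) * (k * x + 1.0)) for k in range(1, n + 1))


def S(x):
    """Series limit: 0 at x = 0 and 1 for 0 < x <= 1."""
    return 0.0 if x == 0 else 1.0


def worst_error(n, points):
    """Largest |S(x) - s_n(x)| over the sample points."""
    return max(abs(S(x) - s_closed(n, x)) for x in points)


points = [0.0] + [10.0 ** (-k) for k in range(0, 13)]  # 0, 1, 0.1, ..., 1e-12
for n in (10, 1000, 10 ** 6):
    print(f"n = {n:8d}  worst error on grid = {worst_error(n, points):.6f}")
```

However large $n$ is taken, a point sufficiently close to $x = 0$ still gives an error near 1, so no single $N$ serves the whole interval.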


Weierstrass M Test 

The most commonly encountered test for uniform convergence is the Weierstrass M test. If we can construct a series of numbers $\sum_{i=1}^{\infty} M_i$, in which $M_i \ge |u_i(x)|$ for all $x$ in the interval $[a, b]$ and $\sum_{i=1}^{\infty} M_i$ is convergent, our series $\sum_{i=1}^{\infty} u_i(x)$ will be uniformly convergent in $[a, b]$.

The proof of this Weierstrass M test is direct and simple. Since $\sum_i M_i$ converges, some number $N$ exists such that for $n + 1 \ge N$,

$$\sum_{i=n+1}^{\infty} M_i < \varepsilon. \tag{5.69}$$

This follows from our definition of convergence. Then, with $|u_i(x)| \le M_i$ for all $x$ in the interval $a \le x \le b$,

$$\sum_{i=n+1}^{\infty} |u_i(x)| < \varepsilon. \tag{5.70}$$

Hence

$$|S(x) - s_n(x)| = \left|\sum_{i=n+1}^{\infty} u_i(x)\right| < \varepsilon, \tag{5.71}$$

and by definition $\sum_{i=1}^{\infty} u_i(x)$ is uniformly convergent in $[a, b]$. Since we have specified absolute values in the statement of the Weierstrass M test, the series $\sum_{i=1}^{\infty} u_i(x)$ is also seen to be absolutely convergent.

The reader should note carefully that uniform convergence and absolute convergence are independent properties. Neither implies the other. For specific examples,

$$\sum_{n=1}^{\infty} \frac{(-1)^n}{n + x^2}, \quad -\infty < x < \infty \tag{5.72}$$

and

$$\sum_{n=1}^{\infty} (-1)^{n-1}\frac{x^n}{n} = \ln(1+x), \quad 0 \le x \le 1 \tag{5.73}$$

converge uniformly in the indicated intervals but do not converge absolutely. On the other hand,

$$\sum_{n=0}^{\infty} (1-x)x^n = \begin{cases} 1, & 0 \le x < 1 \\ 0, & x = 1 \end{cases} \tag{5.74}$$

converges absolutely but does not converge uniformly in $[0, 1]$.

From the definition of uniform convergence we may show that any series

$$f(x) = \sum_{n=1}^{\infty} u_n(x) \tag{5.75}$$

(with each term $u_n(x)$ continuous) cannot converge uniformly in any interval that includes a discontinuity of $f(x)$.

Since the Weierstrass M test establishes both uniform and absolute convergence, it will necessarily fail for series that are uniformly but conditionally convergent.

Abel's Test 

A somewhat more delicate test for uniform convergence has been given by Abel. If

$$u_n(x) = a_n f_n(x),$$
$$\sum a_n = A, \quad \text{convergent},$$

and the functions $f_n(x)$ are monotonic $[f_{n+1}(x) \le f_n(x)]$ and bounded, $0 \le f_n(x) \le M$, for all $x$ in $[a, b]$, then $\sum u_n(x)$ converges uniformly in $[a, b]$.

This test is especially useful in analyzing power series (compare Section 5.7). Details of the proof of Abel's test and other tests for uniform convergence are given in the references listed at the end of this chapter.

Uniformly convergent series have three particularly useful properties.

1. If the individual terms $u_n(x)$ are continuous, the series sum

$$f(x) = \sum_{n=1}^{\infty} u_n(x) \tag{5.76}$$

is also continuous.

2. If the individual terms $u_n(x)$ are continuous, the series may be integrated term by term. The sum of the integrals is equal to the integral of the sum.

$$\int_a^b f(x)\,dx = \sum_{n=1}^{\infty} \int_a^b u_n(x)\,dx. \tag{5.77}$$

3. The derivative of the series sum $f(x)$ equals the sum of the individual term derivatives,

$$\frac{d}{dx} f(x) = \sum_{n=1}^{\infty} \frac{d}{dx} u_n(x), \tag{5.78}$$

provided the following conditions are satisfied.

$u_n(x)$ and $\dfrac{du_n(x)}{dx}$ are continuous in $[a, b]$.

$\displaystyle\sum_{n=1}^{\infty} \frac{du_n(x)}{dx}$ is uniformly convergent in $[a, b]$.

Term-by-term integration of a uniformly convergent series¹ requires only continuity of the individual terms. This condition is almost always satisfied in physical applications. Term-by-term differentiation of a series is often not valid because more restrictive conditions must be satisfied. Indeed, we shall encounter cases in Chapter 14, Fourier Series, in which term-by-term differentiation of a uniformly convergent series leads to a divergent series.

1 Term-by-term integration may also be valid in the absence of uniform convergence.


EXERCISES 


5 . 5.1 Find the range of uniform convergence of 


(a) £ 

n = 1 


(-i r 1 

n x 


(b) 


I 


n = l 


n x 


ANS. (a) 1 < x < oo. 

(b) 1 < s < x < oo. 


5.5.2 For what range of $x$ is the geometric series $\sum_{n=0}^{\infty} x^n$ uniformly convergent?

ANS. $-1 < -s \le x \le s < 1$.

5.5.3 For what range of positive values of $x$ is $\sum_{n=0}^{\infty} 1/(1 + x^n)$

(a) Convergent?

(b) Uniformly convergent?

5.5.4 If the series of the coefficients $\sum a_n$ and $\sum b_n$ are absolutely convergent, show that the Fourier series

$$\sum (a_n \cos nx + b_n \sin nx)$$

is uniformly convergent for $-\infty < x < \infty$.

5.6 TAYLOR'S EXPANSION


This is an expansion of a function into an infinite series or into a finite series 
plus a remainder term. The coefficients of the successive terms of the series 
involve the successive derivatives of the function. We have already used Taylor’s 
expansion in the establishment of a physical interpretation of divergence 
(Section 1.7) and in other sections of Chapters 1 and 2. Now we derive the 
Taylor expansion. 

We assume that our function $f(x)$ has a continuous $n$th derivative¹ in the interval $a \le x \le b$. Then, integrating this $n$th derivative $n$ times,

$$\int_a^x f^{(n)}(x)\,dx = f^{(n-1)}(x)\Big|_a^x = f^{(n-1)}(x) - f^{(n-1)}(a),$$

$$\int_a^x\left(\int_a^x f^{(n)}(x)\,dx\right)dx = \int_a^x\left[f^{(n-1)}(x) - f^{(n-1)}(a)\right]dx = f^{(n-2)}(x) - f^{(n-2)}(a) - (x-a)f^{(n-1)}(a). \tag{5.79}$$

Continuing, we obtain

$$\iiint f^{(n)}(x)(dx)^3 = f^{(n-3)}(x) - f^{(n-3)}(a) - (x-a)f^{(n-2)}(a) - \frac{(x-a)^2}{2}f^{(n-1)}(a). \tag{5.80}$$

Finally, on integrating for the $n$th time,

$$\int\cdots\int f^{(n)}(x)(dx)^n = f(x) - f(a) - (x-a)f'(a) - \frac{(x-a)^2}{2!}f''(a) - \cdots - \frac{(x-a)^{n-1}}{(n-1)!}f^{(n-1)}(a). \tag{5.81}$$

Note that this expression is exact. No terms have been dropped, no approximations made. Now, solving for $f(x)$, we have

$$f(x) = f(a) + (x-a)f'(a) + \frac{(x-a)^2}{2!}f''(a) + \cdots + \frac{(x-a)^{n-1}}{(n-1)!}f^{(n-1)}(a) + R_n. \tag{5.82}$$

The remainder, $R_n$, is given by the $n$-fold integral

$$R_n = \int\cdots\int f^{(n)}(x)(dx)^n. \tag{5.83}$$

This remainder, Eq. 5.83, may be put into perhaps more intelligible form by using the mean value theorem of integral calculus

$$\int_a^x g(x)\,dx = (x-a)\,g(\xi), \tag{5.84}$$

with $a \le \xi \le x$. By integrating $n$ times we get the Lagrangian form² of the remainder:

$$R_n = \frac{(x-a)^n}{n!}f^{(n)}(\xi). \tag{5.85}$$

With Taylor's expansion in this form we are not concerned with any questions of infinite series convergence. This series is finite, and the only questions concern the magnitude of the remainder.

When the function $f(x)$ is such that

$$\lim_{n\to\infty} R_n = 0, \tag{5.86}$$

Eq. 5.82 becomes Taylor's series

$$f(x) = f(a) + (x-a)f'(a) + \frac{(x-a)^2}{2!}f''(a) + \cdots = \sum_{n=0}^{\infty} \frac{(x-a)^n}{n!}f^{(n)}(a). \tag{5.87}$$

Our Taylor series specifies the value of a function at one point, $x$, in terms of the value of the function and its derivatives at a reference point, $a$. It is an expansion in powers of the change in the variable, $\Delta x = x - a$ in this case. The notation may be varied at the user's convenience. With the substitution $x \to x + h$ and $a \to x$ we have an alternate form

$$f(x+h) = \sum_{n=0}^{\infty} \frac{h^n}{n!}f^{(n)}(x).$$

When we use the operator $D = d/dx$ the Taylor expansion becomes

$$f(x+h) = \sum_{n=0}^{\infty} \frac{h^n D^n}{n!}f(x) = e^{hD}f(x).$$

(The transition to the exponential form anticipates Eq. 5.90 that follows.) An equivalent operator form of this Taylor expansion appears in Exercise 4.11.1. A derivation of the Taylor expansion in the context of complex variable theory appears in Section 6.5.

1 Taylor's expansion may be derived under slightly less restrictive conditions; compare Jeffreys and Jeffreys, Methods of Mathematical Physics, Section 1.133.

2 An alternate form derived by Cauchy is
$$R_n = \frac{(x-\xi)^{n-1}(x-a)}{(n-1)!}f^{(n)}(\xi),$$
with $a \le \xi \le x$.

Maclaurin Theorem 

If we expand about the origin ($a = 0$), Eq. 5.87 is known as Maclaurin's series

$$f(x) = f(0) + xf'(0) + \frac{x^2}{2!}f''(0) + \cdots = \sum_{n=0}^{\infty} \frac{x^n}{n!}f^{(n)}(0). \tag{5.88}$$

An immediate application of the Maclaurin series (or the Taylor series) is in the expansion of various transcendental functions into infinite series.

EXAMPLE 5.6.1

Let $f(x) = e^x$. Differentiating, we have

$$f^{(n)}(0) = 1 \tag{5.89}$$

for all $n$, $n = 1, 2, 3, \ldots$. Then, by Eq. 5.88, we have

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{n=0}^{\infty} \frac{x^n}{n!}. \tag{5.90}$$

This is the series expansion of the exponential function. Some authors use this series to define the exponential function.

Although this series is clearly convergent for all $x$, we should check the remainder term, $R_n$. By Eq. 5.85 we have

$$R_n = \frac{x^n}{n!}f^{(n)}(\xi) = \frac{x^n}{n!}e^{\xi}, \quad 0 \le \xi \le x. \tag{5.91}$$

Therefore

$$R_n \le \frac{x^n e^x}{n!} \tag{5.92}$$

and

$$\lim_{n\to\infty} R_n = 0 \tag{5.93}$$

for all finite values of $x$, which indicates that this Maclaurin expansion of $e^x$ is valid over the range $-\infty < x < \infty$.

3 Note that $0! = 1$ (compare Section 10.1).
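The bound of Eq. 5.92 is easy to test numerically for positive $x$. In the sketch below, `exp_partial(x, n)` sums the first $n$ terms of the Maclaurin series (through $x^{n-1}$, matching Eq. 5.82), so the difference from `math.exp(x)` is the remainder $R_n$. The function names are illustrative and the check is restricted to $x > 0$, the case covered by Eq. 5.91.

```python
import math


def exp_partial(x, n):
    """First n terms of the Maclaurin series of e^x: sum of x^k / k! for k = 0..n-1."""
    return sum(x ** k / math.factorial(k) for k in range(n))


def remainder_bound(x, n):
    """Right-hand side of Eq. 5.92: x^n e^x / n!  (x > 0 assumed)."""
    return x ** n * math.exp(x) / math.factorial(n)


x = 2.0
for n in (5, 10, 20):
    remainder = math.exp(x) - exp_partial(x, n)
    print(f"n = {n:2d}  R_n = {remainder:.3e}  bound = {remainder_bound(x, n):.3e}")
```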

EXAMPLE 5.6.2 


Let /(x) = ln(l + x). By differentiating, we obtain 
fix) = (1 + x)" 1 , 

/<»>( X ) = (- lf-v - i)!(i + xy n . 


(5.94) 


The Maclaurin expansion (Eq. 5.88) yields 

ln(l + x) = x- y + y- ^+ -- -+ J R„ 

= S(-ir 1 ^ ! + ^. 

p= i p 

In this case our remainder is given by 

r„ = ~f <n xo, o < t < x 

n ! 

< 0 < <^ < x < 1. 


(5.95) 


(5.96) 


Now the remainder approaches zero as $n$ is increased indefinitely, provided
$0 \le x \le 1$.$^4$ As an infinite series

$$\ln(1 + x) = \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^n}{n}, \tag{5.97}$$

which converges for $-1 < x \le 1$. The range $-1 < x < 1$ is easily established
by the d'Alembert ratio test (Section 5.2). Convergence at $x = 1$ follows by the
Leibnitz criterion (Section 5.3). In particular, at $x = 1$, we have

$$\ln 2 = 1 - \tfrac{1}{2} + \tfrac{1}{3} - \tfrac{1}{4} + \tfrac{1}{5} - \cdots = \sum_{n=1}^{\infty} (-1)^{n-1} n^{-1}, \tag{5.98}$$

the conditionally convergent alternating harmonic series.

4 This range can easily be extended to $-1 < x \le 1$ but not to $x = -1$.
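
To see how slowly the alternating harmonic series approaches $\ln 2$, one can
sum it directly. The sketch below (the function name is an assumption made for
illustration) prints the error after 10, 100, and 1000 terms; for an
alternating series with decreasing terms the error is smaller than the first
omitted term, here $1/(n + 1)$.

```python
import math

def ln2_partial_sum(n):
    """Partial sum of the alternating harmonic series: sum_{k=1}^{n} (-1)**(k-1)/k."""
    return sum((-1) ** (k - 1) / k for k in range(1, n + 1))

for n in (10, 100, 1000):
    print(n, abs(math.log(2) - ln2_partial_sum(n)))
```

Each tenfold increase in the number of terms gains roughly one decimal digit,
which is why this series is of little use for computing $\ln 2$ numerically.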


Binomial Theorem 

A second, extremely important application of the Taylor and Maclaurin 
expansions is the derivation of the binomial theorem for negative and/or 
nonintegral powers. 

Let $f(x) = (1 + x)^m$, in which $m$ may be negative and is not limited to integral
values. Direct application of Eq. 5.88 gives

$$(1 + x)^m = 1 + mx + \frac{m(m - 1)}{2!} x^2 + \cdots + R_n. \tag{5.99}$$

For this function the remainder is

$$R_n = \frac{x^n}{n!} (1 + \xi)^{m-n} \times m(m - 1) \cdots (m - n + 1) \tag{5.100}$$

and $\xi$ lies between 0 and $x$, $0 \le \xi \le x$. Now, for $n > m$, $(1 + \xi)^{m-n}$ is a maximum
for $\xi = 0$. Therefore

$$R_n \le \frac{x^n}{n!} \times m(m - 1) \cdots (m - n + 1). \tag{5.101}$$

Note that the $m$ dependent factors do not yield a zero unless $m$ is a nonnegative
integer; $R_n$ tends to zero as $n \to \infty$ if $x$ is restricted to the range $0 \le x < 1$.

The binomial expansion therefore is shown to be

$$(1 + x)^m = 1 + mx + \frac{m(m - 1)}{2!} x^2 + \frac{m(m - 1)(m - 2)}{3!} x^3 + \cdots. \tag{5.102}$$

In other, equivalent notation

$$(1 + x)^m = \sum_{n=0}^{\infty} \frac{m!}{n!\,(m - n)!} x^n = \sum_{n=0}^{\infty} \binom{m}{n} x^n. \tag{5.103}$$

The quantity $\binom{m}{n}$, which equals $m!/[n!\,(m - n)!]$, is called a binomial coefficient.
Although we have only shown that the remainder vanishes,

$$\lim_{n \to \infty} R_n = 0,$$

for $0 \le x < 1$, the series in Eq. 5.102 actually may be shown to be convergent
for the extended range $-1 < x < 1$. For $m$ an integer, $(m - n)! = \pm\infty$ if $n > m$
(Section 10.1) and the series automatically terminates at $n = m$.
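
The binomial series of Eq. 5.102 can be summed numerically by building each
coefficient from the previous one, $\binom{m}{n+1} = \binom{m}{n} (m - n)/(n + 1)$.
The following is a minimal sketch; the function name and the test values are
chosen here for illustration only.

```python
def binomial_series(x, m, n_terms):
    """Partial sum of (1 + x)**m = sum_n C(m, n) x**n for real m (Eq. 5.102)."""
    coeff, total = 1.0, 0.0  # coeff holds C(m, n), starting from C(m, 0) = 1
    for n in range(n_terms):
        total += coeff * x**n
        coeff *= (m - n) / (n + 1)  # C(m, n+1) from C(m, n)
    return total

# Nonintegral, negative power: compare with direct evaluation for |x| < 1.
print(binomial_series(0.3, -0.5, 60), 1.3 ** -0.5)

# Nonnegative integer power: the coefficients vanish for n > m,
# so the series terminates and is valid for any x.
print(binomial_series(2.0, 3, 10), 3.0 ** 3)
```

The second call illustrates the termination at $n = m$ noted above: once the
factor $(m - n)$ reaches zero, every later coefficient is zero.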


EXAMPLE 5.6.3 Relativistic Energy 


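The total relativistic energy of a particle is

$$E = mc^2 \left( 1 - \frac{v^2}{c^2} \right)^{-1/2}.$$

Compare this equation with the classical kinetic energy, $\tfrac{1}{2} m v^2$.
By Eq. 5.102 with $x = -v^2/c^2$ and exponent $m = -\tfrac{1}{2}$ (the exponent of
Eq. 5.102, not the particle mass) we have

$$E = mc^2 \left[ 1 - \frac{1}{2} \left( -\frac{v^2}{c^2} \right) + \frac{(-1/2)(-3/2)}{2!} \left( -\frac{v^2}{c^2} \right)^2 + \frac{(-1/2)(-3/2)(-5/2)}{3!} \left( -\frac{v^2}{c^2} \right)^3 + \cdots \right] \tag{5.104}$$

or

$$E = mc^2 + \frac{1}{2} m v^2 + \frac{3}{8} m v^2 \cdot \frac{v^2}{c^2} + \frac{5}{16} m v^2 \cdot \left( \frac{v^2}{c^2} \right)^2 + \cdots. \tag{5.105}$$

The first term, $mc^2$, is identified as the rest mass energy. Then

$$E_{\text{kinetic}} = \frac{1}{2} m v^2 \left[ 1 + \frac{3}{4} \frac{v^2}{c^2} + \frac{5}{8} \left( \frac{v^2}{c^2} \right)^2 + \cdots \right]. \tag{5.106}$$

For particle velocity $v \ll c$, the velocity of light, the expression in the brackets
reduces to unity and we see that the kinetic portion of the total relativistic
energy agrees with the classical result.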
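
A short numerical check makes the size of the relativistic corrections
concrete. In the sketch below the kinetic energy is measured in units of $mc^2$
and $\beta = v/c$; the function names are illustrative assumptions.

```python
import math

def kinetic_exact(beta):
    """Exact kinetic energy in units of m c**2: (1 - beta**2)**(-1/2) - 1."""
    return 1.0 / math.sqrt(1.0 - beta**2) - 1.0

def kinetic_series(beta):
    """First three terms of Eq. 5.106, in units of m c**2."""
    b2 = beta**2
    return 0.5 * b2 * (1.0 + 0.75 * b2 + 0.625 * b2**2)

for beta in (0.01, 0.1, 0.5):
    classical = 0.5 * beta**2
    print(beta, kinetic_exact(beta), kinetic_series(beta), classical)
```

At $\beta = 0.1$ the three-term series should agree with the exact value to
within roughly the size of the first omitted term, which is of order $\beta^8$,
while the classical value alone is low by a little under one percent.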

For polynomials we can generalize the binomial expansion to

$$(a_1 + a_2 + \cdots + a_m)^n = \sum \frac{n!}{n_1!\, n_2! \cdots n_m!}\, a_1^{n_1} a_2^{n_2} \cdots a_m^{n_m},$$

where the summation includes all different combinations of $n_1, n_2, \ldots, n_m$ with
$\sum_{i=1}^{m} n_i = n$. Here $n_i$ and $n$ are all integral. This generalization finds considerable
use in statistical mechanics.

Maclaurin series may sometimes appear indirectly rather than by direct use
of Eq. 5.88. For instance, the most convenient way to obtain the series expansion

$$\sin^{-1} x = \sum_{n=0}^{\infty} \frac{(2n - 1)!!}{(2n)!!} \frac{x^{2n+1}}{2n + 1} = x + \frac{x^3}{6} + \frac{3x^5}{40} + \cdots \tag{5.106a}$$

is to make use of the relation

$$\sin^{-1} x = \int_0^x \frac{dt}{(1 - t^2)^{1/2}}.$$

We expand $(1 - t^2)^{-1/2}$ (binomial theorem) and then integrate term by term.
This term-by-term integration is discussed in Section 5.7. The result is Eq. 5.106a.
Finally, we may take the limit as $x \to 1$. The series converges by Gauss's test,
Exercise 5.2.5.
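
The series of Eq. 5.106a is easy to sum numerically, which also shows how the
rate of convergence degrades as $x \to 1$. The sketch below is illustrative
only; the function name is an assumption, and the coefficient
$(2n - 1)!!/(2n)!!$ is taken equal to 1 for $n = 0$.

```python
import math

def arcsin_series(x, n_terms):
    """Partial sum of Eq. 5.106a: sum_n (2n-1)!!/(2n)!! * x**(2n+1)/(2n+1)."""
    total = 0.0
    coeff = 1.0  # (2n-1)!!/(2n)!!, equal to 1 for n = 0
    for n in range(n_terms):
        total += coeff * x ** (2 * n + 1) / (2 * n + 1)
        coeff *= (2 * n + 1) / (2 * n + 2)  # advance to (2n+1)!!/(2n+2)!!
    return total

print(arcsin_series(0.5, 40), math.asin(0.5))        # rapid convergence
print(arcsin_series(1.0, 10000), math.pi / 2)        # slow convergence at x = 1
```

For $x = 0.5$ a few dozen terms reach double precision. At $x = 1$ the terms
fall off only like $n^{-3/2}$, so even ten thousand terms leave an error in the
third decimal place.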


Taylor Expansion — More than One Variable 

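If the function $f$ has more than one independent variable, say, $f = f(x, y)$,
the Taylor expansion becomes

$$
\begin{aligned}
f(x, y) = f(a, b) &+ (x - a) \frac{\partial f}{\partial x} + (y - b) \frac{\partial f}{\partial y} \\
&+ \frac{1}{2!} \left[ (x - a)^2 \frac{\partial^2 f}{\partial x^2} + 2 (x - a)(y - b) \frac{\partial^2 f}{\partial x\, \partial y} + (y - b)^2 \frac{\partial^2 f}{\partial y^2} \right] \\
&+ \frac{1}{3!} \left[ (x - a)^3 \frac{\partial^3 f}{\partial x^3} + 3 (x - a)^2 (y - b) \frac{\partial^3 f}{\partial x^2\, \partial y} \right. \\
&\qquad \left. + 3 (x - a)(y - b)^2 \frac{\partial^3 f}{\partial x\, \partial y^2} + (y - b)^3 \frac{\partial^3 f}{\partial y^3} \right] + \cdots,
\end{aligned} \tag{5.107}
$$

with all derivatives evaluated at the point $(a, b)$. Using $\alpha_j t = x_j - x_{j0}$, we may
write the Taylor expansion for $m$ independent variables in the symbolic form

$$f(x_j) = \sum_{n=0}^{\infty} \frac{t^n}{n!} \left. \left( \sum_{i=1}^{m} \alpha_i \frac{\partial}{\partial x_i} \right)^n f(x_k) \right|_{x_k = x_{k0}}. \tag{5.108}$$

A convenient vector form is

$$\psi(\mathbf{r} + \mathbf{a}) = \sum_{n=0}^{\infty} \frac{1}{n!} (\mathbf{a} \cdot \nabla)^n \psi(\mathbf{r}). \tag{5.109}$$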


EXERCISES 


5.6.1 Show that

(a) $\sin x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{(2n + 1)!}$,

(b) $\cos x = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}$.

In Section 6.1 $e^{ix}$ is defined by a series expansion such that

$$e^{ix} = \cos x + i \sin x.$$

This is the basis for the polar representation of complex quantities. As a special
case we find, with $x = \pi$,

$$e^{i\pi} = -1.$$


5.6.2 Derive a series expansion of cot x in increasing powers of x by dividing cos x by 
sin x. 

Note. The resultant series that starts with $1/x$ is actually a Laurent series (Section
6.5). Although the two series for sin x and cos x were valid for all x, the convergence
of the series for cot x is limited by the zeros of the denominator, sin x.

5.6.3 (a) Expand $(1 + x)\ln(1 + x)$ in a Maclaurin series. Find the limits on $x$ for
convergence.

(b) From the results for part (a) show that

$$\ln 2 = \frac{1}{2} + \frac{1}{2} \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n(n + 1)}.$$

ANS. (a) $(1 + x)\ln(1 + x) = x + \sum_{n=2}^{\infty} (-1)^n \frac{x^n}{n(n - 1)}$, $\quad -1 < x \le 1$.


5.6.4 The Raabe test for $\sum (n \ln n)^{-1}$ leads to

$$\lim_{n \to \infty} n \left[ \frac{(n + 1) \ln(n + 1)}{n \ln n} - 1 \right].$$

Show that this limit is unity (which means that the Raabe test here is
indeterminate).


5.6.5 Show by series expansion that

$$\frac{1}{2} \ln \frac{\eta_0 + 1}{\eta_0 - 1} = \coth^{-1} \eta_0, \qquad |\eta_0| > 1.$$

This identity may be used to obtain a second solution for Legendre's equation.

5.6.6 Show that $f(x) = x^{1/2}$ (a) has no Maclaurin expansion but (b) has a Taylor
expansion about any point $x_0 \ne 0$. Find the range of convergence of the Taylor
expansion about $x = x_0$.


5.6.7 Let $x$ be an approximation for a zero of $f(x)$ and $\Delta x$, the correction.
Show that by neglecting terms of order $(\Delta x)^2$

$$\Delta x = -\frac{f(x)}{f'(x)}.$$

This is Newton's formula for finding a root. Newton's method has the virtues of
illustrating series expansions and elementary calculus but is very treacherous.
See Appendix A1 for details and an alternative.


5.6.8 Expand a function $\Phi(x, y, z)$ by Taylor's expansion. Evaluate $\bar{\Phi}$, the average value
of $\Phi$, averaged over a small cube of side $a$ centered on the origin and show that the
Laplacian of $\Phi$ is a measure of deviation of $\bar{\Phi}$ from $\Phi(0, 0, 0)$.


5.6.9 The ratio of two differentiable functions $f(x)$ and $g(x)$ takes on the indeterminate
form 0/0 at $x = x_0$. Using Taylor expansions prove L'Hospital's rule

$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \lim_{x \to x_0} \frac{f'(x)}{g'(x)}.$$


5.6.10 With $n > 1$, show that

(a) $\frac{1}{n} - \ln\left( \frac{n}{n - 1} \right) < 0$,

(b) $\frac{1}{n} - \ln\left( \frac{n + 1}{n} \right) > 0$.

Use these inequalities to show that the limit defining the Euler-Mascheroni
constant is finite.





5.6.11 Expand $(1 - 2tz + t^2)^{-1/2}$ in powers of $t$. Assume that $t$ is small. Collect the
coefficients of $t^0$, $t^1$, and $t^2$.

ANS. $a_0 = P_0(z) = 1$,
$a_1 = P_1(z) = z$,
$a_2 = P_2(z) = \tfrac{1}{2}(3z^2 - 1)$,
where $a_n = P_n(z)$, the $n$th Legendre polynomial.


5.6.12 Using the double factorial notation of Section 10.1, show that

$$(1 + x)^{-m/2} = \sum_{n=0}^{\infty} (-1)^n \frac{(m + 2n - 2)!!}{2^n\, n!\, (m - 2)!!}\, x^n$$

for $m = 1, 2, 3, \ldots$.


5.6.1 3 Using binomial expansions, compare the three Doppler shift formulas: 

-l 

moving source; 


(a) v ' = v^l + 

(b) v' = v (\ + 

(c) v ; 




1 


moving observer; 


relativistic. 


Note. The relativistic formula agrees with the classical formulas if terms of order 
v 2 !c 2 can be neglected. 


5.6.14 In the theory of general relativity there are various ways of relating (defining) a
velocity of recession of a galaxy to its red shift, $\delta$. Milne's model (kinematic
relativity) gives

(a) $v_1 = c\delta (1 + \tfrac{1}{2}\delta)$,

(b) $v_2 = c\delta (1 + \tfrac{1}{2}\delta)(1 + \delta)^{-2}$,

(c) $1 + \delta = \left[ \frac{1 + v_3/c}{1 - v_3/c} \right]^{1/2}$.

1. Show that for $\delta \ll 1$ (and $v_3/c \ll 1$) all three formulas reduce to $v = c\delta$.

2. Compare the three velocities through terms of order $\delta^2$.

Note. In special relativity (with $\delta$ replaced by $z$), the ratio of observed wavelength
$\lambda$ to emitted wavelength $\lambda_0$ is given by

$$\frac{\lambda}{\lambda_0} = 1 + z = \left( \frac{c + v}{c - v} \right)^{1/2}.$$


5.6.15 The relativistic sum $w$ of two velocities $u$ and $v$ is given by

$$\frac{w}{c} = \frac{u/c + v/c}{1 + uv/c^2}.$$

If

$$\frac{v}{c} = \frac{u}{c} = 1 - \alpha,$$

where $0 \le \alpha \le 1$, find $w/c$ in powers of $\alpha$ through terms in $\alpha^3$.

5.6.16 The displacement $x$ of a particle of rest mass $m_0$, resulting from a constant force
$m_0 g$ along the x-axis, is

$$x = \frac{c^2}{g} \left\{ \left[ 1 + \left( \frac{g t}{c} \right)^2 \right]^{1/2} - 1 \right\},$$

including relativistic effects. Find the displacement $x$ as a power series in time $t$.
Compare with the classical result

$$x = \tfrac{1}{2} g t^2.$$


5.6.17 By use of Dirac's relativistic theory the fine structure formula of atomic
spectroscopy is given by

$$E = mc^2 \left[ 1 + \frac{\gamma^2}{(s + n - |k|)^2} \right]^{-1/2},$$

where

$$s = (|k|^2 - \gamma^2)^{1/2}, \qquad k = \pm 1, \pm 2, \pm 3, \ldots.$$

Expand in powers of $\gamma^2$ through order $\gamma^4$. ($\gamma^2 = Ze^2/\hbar c$, with $Z$ the atomic
number.) This expansion is useful in comparing the predictions of the Dirac
electron theory with those of a relativistic Schrodinger electron theory.
Experimental results support the Dirac theory.


5.6.18 In a head-on proton-proton collision, the ratio of the kinetic energy in the center
of mass system to the incident kinetic energy is

$$R = \frac{\sqrt{2mc^2 (E_k + 2mc^2)} - 2mc^2}{E_k}.$$

Find the value of this ratio of kinetic energies for

(a) $E_k \ll mc^2$ (nonrelativistic)

(b) $E_k \gg mc^2$ (extreme-relativistic)

ANS. (a) $\tfrac{1}{2}$, (b) $\sim 0$. The latter answer is a sort of law of diminishing
returns for high energy particle accelerators (with stationary targets).

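5.6.19 With binomial expansions

$$\frac{x}{1 - x} = \sum_{n=1}^{\infty} x^n, \qquad \frac{x}{x - 1} = \frac{1}{1 - x^{-1}} = \sum_{n=0}^{\infty} x^{-n}.$$

Adding these two series yields $\sum_{n=-\infty}^{\infty} x^n = 0$.

Hopefully, we can agree that this is nonsense, but what has gone wrong?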


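5.6.20 (a) Planck's theory of quantized oscillators led to an average energy

$$\langle \varepsilon \rangle = \frac{\sum_{n=1}^{\infty} n \varepsilon_0 \exp(-n \varepsilon_0 / kT)}{\sum_{n=0}^{\infty} \exp(-n \varepsilon_0 / kT)},$$

where $\varepsilon_0$ was a fixed energy. Identify numerator and denominator as binomial
expansions and show that the ratio is

$$\langle \varepsilon \rangle = \frac{\varepsilon_0}{\exp(\varepsilon_0 / kT) - 1}.$$

(b) Show that the $\langle \varepsilon \rangle$ of part (a) reduces to $kT$, the classical result, for $kT \gg \varepsilon_0$.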


5.6.21 (a) Expand by the binomial theorem and integrate term by term to obtain the
Gregory series for $\tan^{-1} x$:

$$\tan^{-1} x = \int_0^x \frac{dt}{1 + t^2} = \int_0^x \{ 1 - t^2 + t^4 - t^6 + \cdots \}\, dt = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n+1}}{2n + 1}, \qquad -1 \le x \le 1.$$

(b) By comparing series expansions, show that

$$\tan^{-1} x = \frac{i}{2} \ln \left( \frac{1 - ix}{1 + ix} \right).$$

Hint. Compare Exercise 5.4.1.

5.6.22 In numerical analysis it is often convenient to approximate $d^2\psi(x)/dx^2$ by

$$\frac{d^2}{dx^2} \psi(x) \approx \frac{1}{h^2} [\psi(x + h) - 2\psi(x) + \psi(x - h)].$$

Find the error in this approximation.

ANS. Error $= \frac{h^2}{12} \psi^{(4)}(x)$.


5.6.23 You have a function $y(x)$ tabulated at equally spaced values of the argument

$$y_n = y(x_n), \qquad x_n = x_0 + nh.$$

Show that the linear combination

$$\frac{1}{12h} \{ -y_2 + 8y_1 - 8y_{-1} + y_{-2} \}$$

yields

$$y_0' - \frac{h^4}{30} y_0^{(5)} + \cdots.$$

Hence this linear combination yields $y_0'$ if $(h^4/30) y^{(5)}$ and higher powers of $h$ and
higher derivatives of $y(x)$ are negligible.

5.6.24 In a numerical integration of a partial differential equation the three-dimensional
Laplacian is replaced by

$$
\begin{aligned}
\nabla^2 \psi(x, y, z) \to h^{-2} [ &\psi(x + h, y, z) + \psi(x - h, y, z) \\
&+ \psi(x, y + h, z) + \psi(x, y - h, z) + \psi(x, y, z + h) \\
&+ \psi(x, y, z - h) - 6\psi(x, y, z) ].
\end{aligned}
$$

Determine the error in this approximation. Here $h$ is the step size, the distance
between adjacent points in the x-, y-, or z-direction.

5 . 6.25 Using double precision, calculate e from its Maclaurin series. 

Note. This simple, direct approach is the best way of calculating e to high accuracy. 
Sixteen terms give e to 16 significant figures. The reciprocal factorials give very 
rapid convergence. 


5.7 POWER SERIES 

The power series is a special and extremely useful type of infinite series of
the form

$$f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots = \sum_{n=0}^{\infty} a_n x^n, \tag{5.110}$$

where the coefficients $a_i$ are constants, independent of $x$.$^1$


Convergence 

Equation 5.110 may readily be tested for convergence by either the Cauchy
root test or the d'Alembert ratio test (Section 5.2). If

$$\lim_{n \to \infty} \frac{a_{n+1}}{a_n} = R^{-1}, \tag{5.111}$$

the series converges for $-R < x < R$. This is the interval or radius of convergence.
Since the root and ratio tests fail when the limit is unity, the end points
of the interval require special attention.

For instance, if $a_n = n^{-1}$, then $R = 1$ and, from Sections 5.1, 5.2, and 5.3, the
series converges for $x = -1$ but diverges for $x = +1$. If $a_n = n!$, then $R = 0$
and the series diverges for all $x \ne 0$.


Uniform and Absolute Convergence 

Suppose our power series (Eq. 5.110) has been found convergent for $-R <
x < R$; then it will be uniformly and absolutely convergent in any interior
interval, $-S \le x \le S$, where $0 < S < R$.

This may be proved directly by the Weierstrass M test (Section 5.5) by using
$M_i = |a_i| S^i$.

Continuity

Since each of the terms $u_n(x) = a_n x^n$ is a continuous function of $x$ and
$f(x) = \sum a_n x^n$ converges uniformly for $-S \le x \le S$, $f(x)$ must be a continuous
function in the interval of uniform convergence.

This behavior is to be contrasted with the strikingly different behavior of the
Fourier series (Chapter 14), in which the Fourier series is used frequently to
represent discontinuous functions such as sawtooth and square waves.

Differentiation and Integration

With $u_n(x)$ continuous and $\sum a_n x^n$ uniformly convergent, we find that the
differentiated series is a power series with continuous functions and the same
radius of convergence as the original series. The new factors introduced by
differentiation (or integration) do not affect either the root or the ratio test.
Therefore our power series may be differentiated or integrated as often as
desired within the interval of uniform convergence (Exercise 5.7.13).


1 Equation 5.110 may be rewritten with $z = x + iy$, replacing $x$. The following
sections will then yield uniform convergence, integrability, and differentiability
in a region of a complex plane in place of an interval on the x-axis.





In view of the rather severe restrictions placed on differentiation (Section 5.5), 
this is a remarkable and valuable result. 

Uniqueness Theorem 

In the preceding section, using the Maclaurin series, we expanded $e^x$ and
$\ln(1 + x)$ into infinite series. In the succeeding chapters functions are frequently
represented or perhaps defined by infinite series. We now establish that the
power-series representation is unique.

If

$$f(x) = \sum_{n=0}^{\infty} a_n x^n, \quad -R_a < x < R_a, \qquad f(x) = \sum_{n=0}^{\infty} b_n x^n, \quad -R_b < x < R_b, \tag{5.112}$$

with overlapping intervals of convergence, including the origin, then

$$a_n = b_n \tag{5.113}$$

for all $n$; that is, we assume two (different) power-series representations and then
proceed to show that the two are actually identical.

From Eq. 5.112

$$\sum_{n=0}^{\infty} a_n x^n = \sum_{n=0}^{\infty} b_n x^n, \qquad -R < x < R, \tag{5.114}$$

where $R$ is the smaller of $R_a$, $R_b$. By setting $x = 0$ to eliminate all but the constant
terms, we obtain

$$a_0 = b_0. \tag{5.115}$$

Now, exploiting the differentiability of our power series, we differentiate Eq.
5.114, getting

$$\sum_{n=1}^{\infty} n a_n x^{n-1} = \sum_{n=1}^{\infty} n b_n x^{n-1}. \tag{5.116}$$

We again set $x = 0$ to isolate the new constant terms and find

$$a_1 = b_1. \tag{5.117}$$

By repeating this process $n$ times, we get

$$a_n = b_n, \tag{5.118}$$

which shows that the two series coincide. Therefore our power-series representation
is unique.

This will be a crucial point in Section 8.5, in which we use a power series to
develop solutions of differential equations. This uniqueness of power series
appears frequently in theoretical physics. The establishment of perturbation
theory in quantum mechanics is one example. The power-series representation
of functions is often useful in evaluating indeterminate forms, particularly when
l'Hospital's rule may be awkward to apply (Exercise 5.7.9).

EXAMPLE 5.7.1
Evaluate

$$\lim_{x \to 0} \frac{1 - \cos x}{x^2}. \tag{5.119}$$

Replacing cos x by its Maclaurin series expansion, we obtain

$$\frac{1 - \cos x}{x^2} = \frac{1 - (1 - x^2/2! + x^4/4! - \cdots)}{x^2} = \frac{x^2/2! - x^4/4! + \cdots}{x^2} = \frac{1}{2!} - \frac{x^2}{4!} + \cdots.$$

Letting $x \to 0$, we have

$$\lim_{x \to 0} \frac{1 - \cos x}{x^2} = \frac{1}{2}. \tag{5.120}$$

The uniqueness of power series means that the coefficients $a_n$ may be identified
with the derivatives in a Maclaurin series. From

$$f(x) = \sum_{n=0}^{\infty} a_n x^n = \sum_{n=0}^{\infty} \frac{1}{n!} f^{(n)}(0)\, x^n$$

we have

$$a_n = \frac{1}{n!} f^{(n)}(0).$$


Reversion (Inversion) of Power Series 
Suppose we are given a series

$$y - y_0 = a_1 (x - x_0) + a_2 (x - x_0)^2 + \cdots = \sum_{n=1}^{\infty} a_n (x - x_0)^n. \tag{5.121}$$

This gives $(y - y_0)$ in terms of $(x - x_0)$. However, it may be desirable to have
an explicit expression for $(x - x_0)$ in terms of $(y - y_0)$. We may solve Eq. 5.121
for $x - x_0$ by reversion (or inversion) of our series. Assume that

$$x - x_0 = \sum_{n=1}^{\infty} b_n (y - y_0)^n, \tag{5.122}$$

with the $b_n$ to be determined in terms of the assumed known $a_n$. A brute-force
approach, which is perfectly adequate for the first few coefficients, is simply to
substitute Eq. 5.121 into Eq. 5.122. By equating coefficients of $(x - x_0)^n$ on both
sides of Eq. 5.122, since the power series is unique, we obtain

$$
\begin{aligned}
b_1 &= \frac{1}{a_1}, \\
b_2 &= -\frac{a_2}{a_1^3}, \\
b_3 &= \frac{1}{a_1^5} (2a_2^2 - a_1 a_3), \\
b_4 &= \frac{1}{a_1^7} (5a_1 a_2 a_3 - a_1^2 a_4 - 5a_2^3), \quad \text{and so on.}
\end{aligned} \tag{5.123}
$$

Some of the higher coefficients are listed by Dwight.$^2$ A more general and much
more elegant approach is developed by the use of complex variables in the first
and second editions of Mathematical Methods for Physicists.
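
The coefficients of Eq. 5.123 can be checked against a pair of series whose
inverse is known in closed form. The sketch below uses exact rational
arithmetic; the function name `revert` is an illustrative assumption. It takes
$y = e^x - 1$, for which $a_n = 1/n!$ and the inverse is $x = \ln(1 + y)$ with
$b_n = (-1)^{n-1}/n$, and $y = \sin x$, whose inverse $\sin^{-1} y$ begins
$y + y^3/6$.

```python
from fractions import Fraction as F

def revert(a1, a2, a3, a4):
    """First four reversion coefficients b1..b4 of Eq. 5.123."""
    b1 = 1 / a1
    b2 = -a2 / a1**3
    b3 = (2 * a2**2 - a1 * a3) / a1**5
    b4 = (5 * a1 * a2 * a3 - a1**2 * a4 - 5 * a2**3) / a1**7
    return b1, b2, b3, b4

# y = e**x - 1: a_n = 1/n!; expected inverse ln(1 + y) with b_n = (-1)**(n-1)/n.
print(revert(F(1), F(1, 2), F(1, 6), F(1, 24)))

# y = sin x: a_1 = 1, a_3 = -1/6; expected inverse arcsin y = y + y**3/6 + ...
print(revert(F(1), F(0), F(-1, 6), F(0)))
```

Using `Fraction` avoids rounding, so agreement with the known inverse series
is exact rather than approximate.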


EXERCISES 


5.7.1 The classical Langevin theory of paramagnetism leads to an expression for the
magnetic polarization

$$P(x) = C \left( \frac{\cosh x}{\sinh x} - \frac{1}{x} \right).$$

Expand $P(x)$ as a power series for small $x$ (low fields, high temperature).


5.7.2 The depolarizing factor $L$ for an oblate ellipsoid in a uniform electric field parallel
to the axis of rotation is

$$L = \frac{1}{\varepsilon_0} (1 + \zeta_0^2)(1 - \zeta_0 \cot^{-1} \zeta_0),$$

where $\zeta_0$ defines an oblate ellipsoid in oblate spheroidal coordinates $(\xi, \zeta, \varphi)$.
Show that

$$\lim_{\zeta_0 \to \infty} L = \frac{1}{3\varepsilon_0} \quad \text{(sphere)},$$

$$\lim_{\zeta_0 \to 0} L = \frac{1}{\varepsilon_0} \quad \text{(thin sheet)}.$$

5.7.3 The corresponding depolarizing factor (Exercise 5.7.2) for a prolate ellipsoid is

$$L = \frac{1}{\varepsilon_0} (\eta_0^2 - 1) \left( \frac{1}{2} \eta_0 \ln \frac{\eta_0 + 1}{\eta_0 - 1} - 1 \right).$$

Show that

$$\lim_{\eta_0 \to \infty} L = \frac{1}{3\varepsilon_0} \quad \text{(sphere)},$$

$$\lim_{\eta_0 \to 1} L = 0 \quad \text{(long needle)}.$$


2 Dwight, H. B., Tables of Integrals and Other Mathematical Data, 4th ed.
New York: Macmillan (1961). (Compare Formula No. 50.)


5.7.4 The analysis of the diffraction pattern of a circular opening involves

$$\int_0^{2\pi} \cos(c \cos\varphi)\, d\varphi.$$

Expand the integrand in a series and integrate by using

$$\int_0^{2\pi} \cos^{2n}\varphi\, d\varphi = \frac{(2n)!}{2^{2n} (n!)^2} \cdot 2\pi, \qquad \int_0^{2\pi} \cos^{2n+1}\varphi\, d\varphi = 0.$$

The result is $2\pi$ times the Bessel function $J_0(c)$.

5.7.5 Neutrons are created (by a nuclear reaction) inside a hollow sphere of radius $R$.
The newly created neutrons are uniformly distributed over the spherical volume.
Assuming that all directions are equally probable (isotropy), what is the average
distance a neutron will travel before striking the surface of the sphere? Assume
straight line motion, no collisions.

(a) Show that

$$\bar{r} = \frac{3}{2} R \int_0^1 \int_0^{\pi} \sqrt{1 - k^2 \sin^2\theta}\; k^2\, dk\, \sin\theta\, d\theta.$$

(b) Expand the integrand as a series and integrate to obtain

$$\bar{r} = R \left[ 1 - 3 \sum_{n=1}^{\infty} \frac{1}{(2n - 1)(2n + 1)(2n + 3)} \right].$$

(c) Show that the sum of this infinite series is $\tfrac{1}{12}$, giving $\bar{r} = \tfrac{3}{4} R$.

Hint. Show that $s_n = \tfrac{1}{12} - [4(2n + 1)(2n + 3)]^{-1}$ by mathematical induction.
Then let $n \to \infty$.


5.7.6 Given that

$$\int_0^1 \frac{dx}{1 + x^2} = \tan^{-1} x \Big|_0^1 = \frac{\pi}{4},$$

expand the integrand into a series and integrate term by term obtaining$^3$

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \cdots + (-1)^n \frac{1}{2n + 1} + \cdots,$$

which is Leibnitz's formula for $\pi$. Compare the convergence (or lack of it) of the
integrand series and the integrated series at $x = 1$.

Leibnitz's formula converges so slowly that it is quite useless for numerical
work; $\pi$ has been computed to 100,000 decimals$^4$ by using expressions such as

$$\pi = 24 \tan^{-1} \tfrac{1}{8} + 8 \tan^{-1} \tfrac{1}{57} + 4 \tan^{-1} \tfrac{1}{239},$$

$$\pi = 48 \tan^{-1} \tfrac{1}{18} + 32 \tan^{-1} \tfrac{1}{57} - 20 \tan^{-1} \tfrac{1}{239}.$$

These expressions may be verified by the use of Exercise 5.6.2.


3 The series expansion of $\tan^{-1} x$ (upper limit 1 replaced by $x$) was discovered
by James Gregory in 1671, 3 years before Leibnitz. See Peter Beckmann's
entertaining and informative book, A History of Pi, 2nd ed. Boulder, Col.:
The Golem Press (1971).

4 Shanks, D., and J. W. Wrench, Jr., "Computation of $\pi$ to 100,000 decimals,"
Math. Computation 16, 76 (1962).

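5.7.7 Expand the incomplete factorial function

$$\int_0^x e^{-t} t^n\, dt$$

in a series of powers of $x$ for small values of $x$. What is the range of convergence
of the resulting series? Why was $x$ specified to be small?

ANS. $\displaystyle \int_0^x e^{-t} t^n\, dt = x^{n+1} \left[ \frac{1}{n + 1} - \frac{x}{n + 2} + \frac{x^2}{2!\,(n + 3)} - \cdots + \frac{(-1)^p x^p}{p!\,(n + p + 1)} + \cdots \right]$.

5.7.8 Derive the series expansion of the incomplete beta function

$$B_x(p, q) = \int_0^x t^{p-1} (1 - t)^{q-1}\, dt = x^p \left\{ \frac{1}{p} + \frac{1 - q}{p + 1} x + \cdots + \frac{(1 - q)(2 - q) \cdots (n - q)}{n!\,(p + n)} x^n + \cdots \right\}$$

for $0 \le x \le 1$, $p > 0$ (and $q > 0$ if $x = 1$).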


5.7.9 Evaluate

(a) $\displaystyle \lim_{x \to 0} \frac{\sin(\tan x) - \tan(\sin x)}{x^7}$,

(b) $\displaystyle \lim_{x \to 0} x^{-n} j_n(x)$ for $n = 3$,

where $j_n(x)$ is a spherical Bessel function (Section 11.7) defined by

$$j_n(x) = (-1)^n x^n \left( \frac{d}{x\, dx} \right)^n \left( \frac{\sin x}{x} \right).$$

ANS. (a) $-\tfrac{1}{30}$, (b) $\tfrac{1}{105}$ for $n = 3$.

5.7.10 Neutron transport theory gives the following expression for the inverse neutron
diffusion length of $k$:

$$\frac{a - b}{k} \tanh^{-1} \left( \frac{k}{a} \right) = 1.$$

By series inversion or otherwise, determine $k^2$ as a series of powers of $b/a$. Give
the first two terms of the series.

ANS. $k^2 = 3ab \left( 1 - \frac{4}{5} \frac{b}{a} \right)$.


5.7.11 Develop a series expansion of $\sinh^{-1} x$ in powers of $x$ by

(a) reversion of the series for $\sinh y$,

(b) a direct Maclaurin expansion.

5.7.12 A function $f(z)$ is represented by a descending power series

$$f(z) = \sum_{n=0}^{\infty} a_n z^{-n}, \qquad R \le z < \infty.$$

Show that this series expansion is unique; that is, if $f(z) = \sum_{n=0}^{\infty} b_n z^{-n}$, $R \le z < \infty$,
then $a_n = b_n$ for all $n$.

5.7.13 A power series given by

$$f(x) = \sum_{n=0}^{\infty} a_n x^n$$

converges for $-R < x < R$. Show that the differentiated series and the integrated
series have the same interval of convergence. (Do not bother about the end
points $x = \pm R$.)

5.7.14 Assume that $f(x)$ may be expanded in a power series about the origin,
$f(x) = \sum_{n=0}^{\infty} a_n x^n$, with some nonzero range of convergence. Use the techniques
employed in proving uniqueness of series to show that your assumed series is a
Maclaurin series:

$$a_n = \frac{1}{n!} f^{(n)}(0).$$
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(b) 


'1 

0 


lnx 


dx 

1 + x 2 


5.7.18 Calculate $\pi$ (double precision) by each of the following arc tangent expressions:

$$\pi = 16 \tan^{-1}(1/5) - 4 \tan^{-1}(1/239)$$

$$\pi = 24 \tan^{-1}(1/8) + 8 \tan^{-1}(1/57) + 4 \tan^{-1}(1/239)$$

$$\pi = 48 \tan^{-1}(1/18) + 32 \tan^{-1}(1/57) - 20 \tan^{-1}(1/239).$$

You should obtain 16 significant figures.

Note. These formulas have been used in some of the more accurate calculations
of $\pi$.$^5$


5.7.19 An analysis of the Gibbs phenomenon of Section 14.5 leads to the expression

$$\frac{2}{\pi} \int_0^{\pi} \frac{\sin \xi}{\xi}\, d\xi.$$

(a) Expand the integrand in a series and integrate term by term. Find the
numerical value of this expression to four significant figures.

(b) Evaluate this expression by the Gaussian quadrature (Appendix A2).

ANS. 1.178980.


5.8 ELLIPTIC INTEGRALS 

Elliptic integrals are included here partly as an illustration of the use of 
power series and partly for their own intrinsic interest. This interest includes 
the occurrence of elliptic integrals in physical problems (Example 5.8.1 and 
Exercise 5.8.4) and applications in mathematical problems. 

EXAMPLE 5.8.1 Period of a Simple Pendulum 



FIG. 5.8 Simple pendulum 


5 Shanks, D., and J. W. Wrench, "Computation of $\pi$ to 100,000 decimals,"
Math. Computation 16, 76 (1962).

For small amplitude oscillations our pendulum (Figure 5.8) has simple harmonic
motion with a period $T = 2\pi (l/g)^{1/2}$. For a maximum amplitude $\theta_M$ large
enough so that $\sin\theta_M \ne \theta_M$, Newton's second law of motion and Lagrange's
equation (Section 17.7) lead to a nonlinear differential equation ($\sin\theta$ is a
nonlinear function of $\theta$), so we turn to a different approach.

The swinging mass $m$ has a kinetic energy of $\tfrac{1}{2} m l^2 (d\theta/dt)^2$ and a potential
energy of $-mgl\cos\theta$ ($\theta = \pi/2$ taken for the arbitrary zero of potential energy).
Since $d\theta/dt = 0$ at $\theta = \theta_M$, the conservation of energy principle gives

$$\frac{1}{2} m l^2 \left( \frac{d\theta}{dt} \right)^2 - mgl\cos\theta = -mgl\cos\theta_M. \tag{5.124}$$

Solving for $d\theta/dt$ we obtain

$$\frac{d\theta}{dt} = \pm \left( \frac{2g}{l} \right)^{1/2} (\cos\theta - \cos\theta_M)^{1/2} \tag{5.125}$$

with the mass $m$ canceling out. We take $t$ to be zero when $\theta = 0$ and $d\theta/dt > 0$.
An integration from $\theta = 0$ to $\theta = \theta_M$ yields

$$\int_0^{\theta_M} (\cos\theta - \cos\theta_M)^{-1/2}\, d\theta = \left( \frac{2g}{l} \right)^{1/2} \int_0^t dt = \left( \frac{2g}{l} \right)^{1/2} t. \tag{5.126}$$

This is $\tfrac{1}{4}$ of a cycle, and therefore the time $t$ is $\tfrac{1}{4}$ of the period, $T$. We note that
$\theta \le \theta_M$, and with a bit of clairvoyance we try the half-angle substitution

$$\sin\left( \frac{\theta}{2} \right) = \sin\left( \frac{\theta_M}{2} \right) \sin\varphi. \tag{5.127}$$

With this, Eq. 5.126 becomes

$$T = 4 \left( \frac{l}{g} \right)^{1/2} \int_0^{\pi/2} \left( 1 - \sin^2\left( \frac{\theta_M}{2} \right) \sin^2\varphi \right)^{-1/2} d\varphi. \tag{5.128}$$


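Although not an obvious improvement over Eq. 5.126, the integral now defines
the complete elliptic integral of the first kind, $K(\sin^2 \theta_M/2)$. From the series
expansion, the period of our pendulum may be developed as a power series in
powers of $\sin \theta_M/2$:

$$T = 2\pi \left( \frac{l}{g} \right)^{1/2} \left\{ 1 + \frac{1}{4} \sin^2 \frac{\theta_M}{2} + \frac{9}{64} \sin^4 \frac{\theta_M}{2} + \cdots \right\}. \tag{5.129}$$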
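
The truncated series of Eq. 5.129 can be compared with an accurate value of
the period. The sketch below evaluates $K(m)$ by the arithmetic-geometric mean,
using the standard identity $K(m) = \pi / [2\,\mathrm{AGM}(1, \sqrt{1 - m})]$;
this identity is not derived in this section and is brought in here only as an
independent check. Periods are expressed as the ratio $T/T_0$ with
$T_0 = 2\pi (l/g)^{1/2}$, and the function names are illustrative.

```python
import math

def K_agm(m):
    """Complete elliptic integral K(m), 0 <= m < 1, via the arithmetic-geometric mean."""
    a, b = 1.0, math.sqrt(1.0 - m)
    for _ in range(30):  # quadratic convergence; 30 steps is far more than needed
        a, b = 0.5 * (a + b), math.sqrt(a * b)
    return math.pi / (2.0 * a)

def period_ratio_exact(theta_max):
    """T/T0 = (2/pi) K(sin^2(theta_max/2)), from Eq. 5.128."""
    return 2.0 / math.pi * K_agm(math.sin(0.5 * theta_max) ** 2)

def period_ratio_series(theta_max):
    """T/T0 from the three terms shown in Eq. 5.129."""
    s2 = math.sin(0.5 * theta_max) ** 2
    return 1.0 + 0.25 * s2 + 9.0 / 64.0 * s2**2

for degrees in (10, 50, 90):
    theta = math.radians(degrees)
    print(degrees, period_ratio_exact(theta), period_ratio_series(theta))
```

For small amplitudes the two columns should agree closely; the gap widens with
amplitude as the omitted terms in $\sin^6(\theta_M/2)$ and beyond become
significant.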


Definitions 

Generalizing Example 5.8.1 to include the upper limit as a variable, the
elliptic integral of the first kind is defined as

$$F(\varphi \backslash \alpha) = \int_0^{\varphi} (1 - \sin^2\alpha \sin^2\theta)^{-1/2}\, d\theta \tag{5.130a}$$

or

$$F(x | m) = \int_0^x [(1 - t^2)(1 - m t^2)]^{-1/2}\, dt, \qquad 0 \le m < 1. \tag{5.130b}$$

(This is the notation of AMS-55.) For $\varphi = \pi/2$, $x = 1$, we have the complete
elliptic integral of the first kind;

$$K(m) = \int_0^{\pi/2} (1 - m \sin^2\theta)^{-1/2}\, d\theta = \int_0^1 [(1 - t^2)(1 - m t^2)]^{-1/2}\, dt, \tag{5.131}$$

with $m = \sin^2\alpha$, $0 \le m < 1$.

The elliptic integral of the second kind is defined by

$$E(\varphi \backslash \alpha) = \int_0^{\varphi} (1 - \sin^2\alpha \sin^2\theta)^{1/2}\, d\theta \tag{5.132a}$$

or

$$E(x | m) = \int_0^x \left( \frac{1 - m t^2}{1 - t^2} \right)^{1/2} dt, \qquad 0 \le m \le 1. \tag{5.132b}$$

Again, for the case $\varphi = \pi/2$, $x = 1$, we have the complete elliptic integral of the
second kind:

$$E(m) = \int_0^{\pi/2} (1 - m \sin^2\theta)^{1/2}\, d\theta = \int_0^1 \left( \frac{1 - m t^2}{1 - t^2} \right)^{1/2} dt, \qquad 0 \le m \le 1. \tag{5.133}$$

Exercise 5.8.1 is an example of its occurrence. Fig. 5.9 shows the behavior of
$K(m)$ and $E(m)$. Extensive tables are available in AMS-55.


Series Expansion 

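For our range $0 \le m < 1$, the denominator of $K(m)$ may be expanded by the
binomial series,

$$(1 - m \sin^2\theta)^{-1/2} = 1 + \frac{1}{2} m \sin^2\theta + \frac{3}{8} m^2 \sin^4\theta + \cdots = \sum_{n=0}^{\infty} \frac{(2n - 1)!!}{(2n)!!}\, m^n \sin^{2n}\theta. \tag{5.134}$$

For any closed interval $[0, m_{\max}]$, $m_{\max} < 1$, this series is uniformly convergent
and may be integrated term by term. From Exercise 10.4.9

$$\int_0^{\pi/2} \sin^{2n}\theta\, d\theta = \frac{(2n - 1)!!}{(2n)!!} \frac{\pi}{2}. \tag{5.135}$$

Hence

$$K(m) = \frac{\pi}{2} \left\{ 1 + \left( \frac{1}{2} \right)^2 m + \left( \frac{1 \cdot 3}{2 \cdot 4} \right)^2 m^2 + \left( \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6} \right)^2 m^3 + \cdots \right\}. \tag{5.136}$$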


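Similarly,

$$E(m) = \frac{\pi}{2} \left\{ 1 - \left( \frac{1}{2} \right)^2 \frac{m}{1} - \left( \frac{1 \cdot 3}{2 \cdot 4} \right)^2 \frac{m^2}{3} - \left( \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6} \right)^2 \frac{m^3}{5} - \cdots \right\} \tag{5.137}$$

(Exercise 5.8.2). In Section 13.5 these series are identified as hypergeometric
functions, and we have

$$K(m) = \frac{\pi}{2}\, {}_2F_1\left( \tfrac{1}{2}, \tfrac{1}{2}; 1; m \right), \tag{5.138}$$

$$E(m) = \frac{\pi}{2}\, {}_2F_1\left( -\tfrac{1}{2}, \tfrac{1}{2}; 1; m \right). \tag{5.139}$$

FIG. 5.9 Complete elliptic integrals, $K(m)$ and $E(m)$, plotted against $m$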
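
The series of Eqs. 5.136 and 5.137 translate directly into short summation
loops. The following sketch is one possible arrangement, with function names
and the default number of terms chosen here for illustration; each squared
coefficient $[(2n - 1)!!/(2n)!!]^2 m^n$ is built from the previous one.

```python
import math

def K_series(m, n_terms=200):
    """K(m) from Eq. 5.136; c holds [(2n-1)!!/(2n)!!]**2 * m**n."""
    total, c = 0.0, 1.0
    for n in range(n_terms):
        total += c
        c *= ((2 * n + 1) / (2 * n + 2)) ** 2 * m
    return 0.5 * math.pi * total

def E_series(m, n_terms=200):
    """E(m) from Eq. 5.137; each term is the K(m) term divided by (2n - 1)."""
    total, c = 1.0, 1.0
    for n in range(1, n_terms):
        c *= ((2 * n - 1) / (2 * n)) ** 2 * m
        total -= c / (2 * n - 1)
    return 0.5 * math.pi * total

for m in (0.0, 0.1, 0.5, 0.9):
    print(m, K_series(m), E_series(m))
```

A convenient consistency check is Legendre's relation, which for $m = 1/2$
reduces to $2EK - K^2 = \pi/2$. For $m$ close to 1 the terms decay slowly and a
fixed number of terms is no longer adequate.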


Limiting Values 

From the series Eqs. 5.136 and 5.137, or from the defining integrals,

$$\lim_{m \to 0} K(m) = \frac{\pi}{2}, \tag{5.140}$$

$$\lim_{m \to 0} E(m) = \frac{\pi}{2}. \tag{5.141}$$

For $m \to 1$ the series expansions are of little use. However, the integrals yield

$$\lim_{m \to 1} K(m) = \infty, \tag{5.142}$$

the integral diverging logarithmically, and

$$\lim_{m \to 1} E(m) = 1. \tag{5.143}$$

The elliptic integrals have been used extensively in the past for evaluating
integrals. For instance, integrals of the form

$$I = \int_0^x R\left( t, \sqrt{a_4 t^4 + a_3 t^3 + a_2 t^2 + a_1 t + a_0} \right) dt,$$

where $R$ is a rational function of $t$ and of the radical, may be expressed in terms
of elliptic integrals. Jahnke and Emde, Chapter 5, give pages of such
transformations. With high-speed computers available for direct numerical
evaluation, interest in these elliptic integral techniques has declined. However, elliptic
integrals still remain of interest because of their appearance in physical problems
(Exercises 5.8.4 and 5.8.5).


EXERCISES 


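5.8.1 The ellipse $x^2/a^2 + y^2/b^2 = 1$ may be represented parametrically by $x = a\sin\theta$,
$y = b\cos\theta$. Show that the length of arc within the first quadrant is

$$a \int_0^{\pi/2} (1 - m \sin^2\theta)^{1/2}\, d\theta = a E(m).$$

Here $0 \le m = (a^2 - b^2)/a^2 \le 1$.


5.8.2 Derive the series expansion

$$E(m) = \frac{\pi}{2} \left\{ 1 - \left( \frac{1}{2} \right)^2 \frac{m}{1} - \left( \frac{1 \cdot 3}{2 \cdot 4} \right)^2 \frac{m^2}{3} - \cdots \right\} = \frac{\pi}{2} \left\{ 1 - \sum_{n=1}^{\infty} \left[ \frac{(2n - 1)!!}{(2n)!!} \right]^2 \frac{m^n}{2n - 1} \right\}.$$


5.8.3 Show that

$$\lim_{m \to 0} \frac{K - E}{m} = \frac{\pi}{4}.$$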


5.8.4 A circular loop of wire in the xy-plane, as shown, carries a current $I$. Given that
the vector potential is

$$A_\varphi(\rho, \varphi, z) = \frac{a \mu_0 I}{2\pi} \int_0^{\pi} \frac{\cos\alpha\, d\alpha}{(a^2 + \rho^2 + z^2 - 2a\rho\cos\alpha)^{1/2}},$$

show that

$$A_\varphi(\rho, \varphi, z) = \frac{\mu_0 I}{\pi k} \left( \frac{a}{\rho} \right)^{1/2} \left[ \left( 1 - \frac{k^2}{2} \right) K(k^2) - E(k^2) \right],$$

where

$$k^2 = \frac{4a\rho}{(a + \rho)^2 + z^2}.$$

Note. For extension of Exercise 5.8.4 to $\mathbf{B}$, see Smythe, page 270.$^1$


1 Smythe, W. R., Static and Dynamic Electricity, 3rd ed. McGraw-Hill,
New York (1969).

5.8.5 An analysis of the magnetic vector potential of a circular current loop leads to
the expression

$$f(k^2) = k^{-2} [(2 - k^2) K(k^2) - 2E(k^2)],$$

where $K(k^2)$ and $E(k^2)$ are the complete elliptic integrals of the first and second
kinds. Show that for $k^2 \ll 1$ ($r \gg$ radius of loop)

$$f(k^2) \approx \frac{\pi k^2}{16}.$$


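5.8.6 Show that

(a) $\displaystyle \frac{dE(k^2)}{dk} = \frac{1}{k} (E - K)$,

(b) $\displaystyle \frac{dK(k^2)}{dk} = \frac{E}{k(1 - k^2)} - \frac{K}{k}$.

Hint. For part (b) show that

$$E(k^2) = (1 - k^2) \int_0^{\pi/2} (1 - k^2 \sin^2\theta)^{-3/2}\, d\theta$$

by comparing series expansions.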


5.8.7 (a) Write a function subroutine that will compute $E(m)$ from the series
expansion, Eq. 5.137.

(b) Test your function subroutine by using it to calculate $E(m)$ over the range
$m = 0.0(0.1)0.9$ and comparing the result with the values given by AMS-55.

5.8.8 Repeat Exercise 5.8.7 for $K(m)$. To be written out as in Exercise 5.8.7.

Note. These series for $E(m)$, Eq. 5.137, and $K(m)$, Eq. 5.136, converge only very
slowly for $m$ near 1. More rapidly converging series for $E(m)$ and $K(m)$ exist. See
Dwight's Tables of Integrals:$^2$ No. 773.2 and 774.2. Your computer subroutine
for computing $E$ and $K$ probably uses polynomial approximations: AMS-55,
Chapter 17.


2 Dwight, H. B., Tables of Integrals and Other Mathematical Data. New York:
Macmillan Co. (1947).

5.8.9 A simple pendulum is swinging with a maximum amplitude of $\theta_M$. In the limit as
$\theta_M \to 0$, the period is 1 sec. Using the elliptic integral, $K(k^2)$, $k = \sin(\theta_M/2)$, calculate
the period $T$ for $\theta_M = 0\ (10°)\ 90°$.

Caution. Some elliptic integral subroutines require $k = m^{1/2}$ as an input parameter,
not $m$ itself.

Check values.
    theta_M    T (sec)
    10°        1.00193
    50°        1.05033
    90°        1.18258

5.8.10 Calculate the magnetic vector potential $\mathbf{A}(\rho, \varphi, z) = \hat{\boldsymbol{\varphi}}_0 A_\varphi(\rho, \varphi, z)$ of a circular
current loop (Exercise 5.8.4) for the ranges $\rho/a = 2, 3, 4$, and $z/a = 0, 1, 2, 3, 4$.
Note. This elliptic integral calculation of the magnetic vector potential may be
checked by an associated Legendre function calculation, Example 12.5.1.

Check value. For $\rho/a = 3$ and $z/a = 0$;
$A_\varphi = 0.029023\, \mu_0 I$.


5.9 BERNOULLI NUMBERS, EULER-MACLAURIN 
FORMULA 


The Bernoulli numbers were introduced by Jacques (James, Jacob) Bernoulli. 
There are several equivalent definitions, but extreme care must be taken, for 
some authors introduce variations in numbering or in algebraic signs. One 
relatively simple approach is to define the Bernoulli numbers by the series 1 


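$$\frac{x}{e^x - 1} = \sum_{n=0}^{\infty} \frac{B_n x^n}{n!}. \tag{5.144}$$

By differentiating this power series repeatedly and then setting $x = 0$, we obtain

$$B_n = \left[\frac{d^n}{dx^n}\left(\frac{x}{e^x - 1}\right)\right]_{x=0}. \tag{5.145}$$

Specifically,

$$B_1 = \frac{d}{dx}\left(\frac{x}{e^x - 1}\right)\bigg|_{x=0} = \left[\frac{1}{e^x - 1} - \frac{x e^x}{(e^x - 1)^2}\right]_{x=0} = -\frac{1}{2}, \tag{5.146}$$
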


as may be seen by series expansion of the denominators. 

Since these derivatives are awkward to evaluate, we may introduce instead 
a series expansion into the defining expression (Eq. 5.144) to obtain 

$$\left[1 + \frac{x}{2!} + \frac{x^2}{3!} + \cdots\right]\left[B_0 + B_1 x + B_2\frac{x^2}{2!} + \cdots\right] = 1. \tag{5.147}$$


Using the power-series uniqueness theorem (Section 5.7) with the coefficient of 


1 The function x/(e x — 1) may be considered a generating function since it 
generates the Bernoulli numbers. Generating functions that generate the 
special functions of mathematical physics appear in Chapters 11, 12, and 13. 
TABLE 5.1 Bernoulli Numbers

  n     B_n (exact)    B_n (decimal)
  0      1              1.000000000
  1     -1/2           -0.500000000
  2      1/6            0.166666667
  4     -1/30          -0.033333333
  6      1/42           0.023809524
  8     -1/30          -0.033333333
 10      5/66           0.075757576


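$x^0$ equal to unity and the coefficient of $x^n$ ($n \neq 0$) equal to zero, we obtain

$$B_0 = 1, \qquad \frac{1}{2!}B_0 + B_1 = 0, \qquad \frac{1}{3!}B_0 + \frac{1}{2!}B_1 + \frac{1}{2!}B_2 = 0, \tag{5.148}$$

so that

$$B_0 = 1, \qquad B_1 = -\frac{1}{2}, \qquad B_2 = \frac{1}{6}. \tag{5.149}$$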


Continuing, we have Table 5.1.

Further values are given in National Bureau of Standards, Handbook of
Mathematical Functions (AMS-55). The odd-index Bernoulli numbers beyond
$B_1$ vanish:

$$B_{2n+1} = 0, \qquad n = 1, 2, 3, \ldots.$$
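
As a numerical companion to Eqs. 5.147 to 5.149, the following sketch generates the Bernoulli numbers by equating the coefficient of $x^n$ in Eq. 5.147 to zero, which is equivalent to $\sum_{k=0}^{n}\binom{n+1}{k}B_k = 0$ for $n \geq 1$. The function name `bernoulli_numbers` and the use of exact `Fraction` arithmetic are illustrative assumptions, not part of the text.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_n_max] in the convention of Eq. 5.144 (B_1 = -1/2)."""
    B = [Fraction(1)]  # B_0 = 1 from the coefficient of x^0
    for n in range(1, n_max + 1):
        # Coefficient of x^n in Eq. 5.147, multiplied through by (n + 1)!:
        # sum_{k=0}^{n} C(n+1, k) B_k = 0, solved for B_n.
        partial = sum(comb(n + 1, k) * B[k] for k in range(n))
        B.append(-partial / (n + 1))
    return B


if __name__ == "__main__":
    for n, b in enumerate(bernoulli_numbers(10)):
        print(n, b, float(b))
```

The printed values may be compared with Table 5.1; the odd-index entries beyond $B_1$ come out as exact zeros.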

If the variable $x$ in Eq. 5.144 is replaced by $2ix$ (and $B_1$ set equal to $-\frac{1}{2}$), we
obtain an alternate (and equivalent) definition of $B_{2n}$ by the expression

$$x\cot x = \sum_{n=0}^{\infty} (-1)^n B_{2n}\frac{(2x)^{2n}}{(2n)!}, \qquad -\pi < x < \pi. \tag{5.150}$$


Using the method of residues (Section 7.2) or working from the infinite product
representation of $\sin x$ (Section 5.10), we find that

$$B_{2n} = \frac{(-1)^{n-1}\,2(2n)!}{(2\pi)^{2n}}\sum_{p=1}^{\infty} p^{-2n}, \qquad n = 1, 2, 3, \ldots. \tag{5.151}$$


This representation of the Bernoulli numbers was discovered by Euler. It is
readily seen from Eq. 5.151 that $|B_{2n}|$ increases without limit as $n \to \infty$.
Numerical values have been calculated by Glaisher.² Illustrating the divergent
behavior of the Bernoulli numbers, we have

$$B_{20} = -5.291 \times 10^{2}, \qquad B_{200} = -3.647 \times 10^{215}. \tag{5.152}$$
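
The growth quoted in Eq. 5.152 can be checked directly from Euler's representation, Eq. 5.151. The sketch below works with logarithms to avoid overflow; the helper names `zeta_direct` and `log10_abs_bernoulli` are hypothetical, and the direct summation of the zeta series is assumed adequate only because the argument $2n$ is large here.

```python
import math


def zeta_direct(s, tol=1e-17):
    """Sum p**(-s) term by term; intended for integer s >= 10, where convergence is fast."""
    if s < 10:
        raise ValueError("direct summation is only used here for s >= 10")
    total, p = 0.0, 1
    while True:
        term = p ** (-s)
        if term < tol:
            break
        total += term
        p += 1
    return total


def log10_abs_bernoulli(two_n):
    """log10 |B_2n| from Eq. 5.151: |B_2n| = 2 (2n)! zeta(2n) / (2 pi)**(2n)."""
    ln_value = (math.log(2.0)
                + math.lgamma(two_n + 1)            # ln (2n)!
                - two_n * math.log(2.0 * math.pi)
                + math.log(zeta_direct(two_n)))
    return ln_value / math.log(10.0)


if __name__ == "__main__":
    for two_n in (20, 100, 200):
        print(two_n, round(log10_abs_bernoulli(two_n), 4))
```

The sign of $B_{2n}$ is $(-1)^{n-1}$ and is not carried by this logarithmic form.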


2 Glaisher, J. W. L., "Table of the first 250 Bernoulli's numbers (to nine
figures) and their logarithms (to ten figures)," Trans. Cambridge Phil. Soc. XII,
390 (1871-1879).
Some authors prefer to define the Bernoulli numbers with a modified version
of Eq. 5.151 by using

$$B_n = \frac{2(2n)!}{(2\pi)^{2n}}\sum_{p=1}^{\infty} p^{-2n}, \tag{5.153}$$

the subscript being just half of our subscript and all signs being positive. Again,
when using other texts or references the reader must check carefully to see
exactly how the Bernoulli numbers are defined.

The Bernoulli numbers occur frequently in number theory. The von
Staudt-Clausen theorem states that

$$B_{2n} = A_n - \frac{1}{p_1} - \frac{1}{p_2} - \frac{1}{p_3} - \cdots - \frac{1}{p_k}, \tag{5.154}$$

in which $A_n$ is an integer and $p_1, p_2, \ldots, p_k$ are prime numbers that exceed by 1
a divisor of $2n$. It may readily be verified that this holds for

$$\begin{aligned}
&B_6 \quad (A_3 = 1,\ p = 2, 3, 7),\\
&B_8 \quad (A_4 = 1,\ p = 2, 3, 5),\\
&B_{10} \quad (A_5 = 1,\ p = 2, 3, 11),
\end{aligned} \tag{5.155}$$

and other special cases.
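
The special cases of Eq. 5.155 can be verified mechanically. The sketch below recomputes $B_{2n}$ exactly, collects the primes $p$ for which $p - 1$ divides $2n$, and returns $A_n = B_{2n} + \sum 1/p$; the function names are hypothetical and the Bernoulli recursion is the one implied by Eq. 5.147.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n_max):
    """[B_0, ..., B_n_max] with B_1 = -1/2, from sum_{k<=n} C(n+1, k) B_k = 0."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        B.append(-sum(comb(n + 1, k) * B[k] for k in range(n)) / (n + 1))
    return B


def is_prime(p):
    return p >= 2 and all(p % d for d in range(2, int(p ** 0.5) + 1))


def staudt_clausen(two_n):
    """Return (A_n, primes) for Eq. 5.154, with A_n = B_2n + sum of 1/p."""
    b = bernoulli_numbers(two_n)[two_n]
    primes = [d + 1 for d in range(1, two_n + 1)
              if two_n % d == 0 and is_prime(d + 1)]
    return b + sum(Fraction(1, p) for p in primes), primes


if __name__ == "__main__":
    for two_n in (6, 8, 10, 12):
        print(two_n, *staudt_clausen(two_n))
```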

The Bernoulli numbers appear in the summation of integral powers of the
integers,

$$\sum_{j=1}^{N} j^p, \qquad p \text{ integral},$$

and in numerous series expansions of the transcendental functions, including
$\tan x$, $\cot x$, $\csc x$, $\ln|\sin x|$, $\ln|\cos x|$, $\ln|\tan x|$, $\tanh x$, $\coth x$, and $\operatorname{csch} x$.
For example,

$$\tan x = x + \frac{x^3}{3} + \frac{2}{15}x^5 + \cdots + \frac{(-1)^{n-1}2^{2n}(2^{2n} - 1)B_{2n}}{(2n)!}x^{2n-1} + \cdots. \tag{5.156}$$
TABLE 5.2 Bernoulli Functions

$B_0 = 1$
$B_1 = x - \frac{1}{2}$
$B_2 = x^2 - x + \frac{1}{6}$
$B_3 = x^3 - \frac{3}{2}x^2 + \frac{1}{2}x$
$B_4 = x^4 - 2x^3 + x^2 - \frac{1}{30}$
$B_5 = x^5 - \frac{5}{2}x^4 + \frac{5}{3}x^3 - \frac{1}{6}x$
$B_6 = x^6 - 3x^5 + \frac{5}{2}x^4 - \frac{1}{2}x^2 + \frac{1}{42}$

$B_n(0) = B_n$, Bernoulli number

The Bernoulli numbers are likely to come in such series expansions because of
the defining equations (5.144) and (5.150) and because of their relation to the
Riemann zeta function

$$\zeta(2n) = \sum_{p=1}^{\infty} p^{-2n}. \tag{5.157}$$


Bernoulli Functions 
If Eq. 5.144 is generalized slightly, we have 


$$\frac{x e^{xs}}{e^x - 1} = \sum_{n=0}^{\infty} B_n(s)\frac{x^n}{n!}, \tag{5.158}$$

defining the Bernoulli functions, $B_n(s)$. The first seven Bernoulli functions are
given in Table 5.2.

From the generating function, Eq. 5.158, 


$$B_n(0) = B_n, \qquad n = 0, 1, 2, \ldots, \tag{5.159}$$

the Bernoulli function evaluated at zero equals the corresponding Bernoulli
number. Two particularly important properties of the Bernoulli functions
follow from the defining relation: a differentiation relation

$$B_n'(s) = nB_{n-1}(s), \qquad n = 1, 2, 3, \ldots, \tag{5.160}$$

and a symmetry relation

$$B_n(1) = (-1)^n B_n(0), \qquad n = 0, 1, 2, \ldots. \tag{5.161}$$

These relations are used in the development of the Euler-Maclaurin integration 
formula. 


Euler-Maclaurin Integration Formula 

One use of the Bernoulli functions is in the derivation of the Euler-Maclaurin 
integration formula. This formula is used in Section 10.3 for the development 
of an asymptotic expression for the factorial function — Stirling’s series. 

The technique is repeated integration by parts using Eq. 5.160 to create new 
derivatives. We start with 


$$\int_0^1 f(x)\,dx = \int_0^1 f(x)B_0(x)\,dx. \tag{5.162}$$
From Eq. 5.160 and Exercise 5.9.2

$$B_1'(x) = B_0(x) = 1. \tag{5.163}$$


Substituting $B_1'(x)$ into Eq. 5.162 and integrating by parts, we obtain

$$\int_0^1 f(x)\,dx = f(1)B_1(1) - f(0)B_1(0) - \int_0^1 f'(x)B_1(x)\,dx
= \frac{1}{2}[f(1) + f(0)] - \int_0^1 f'(x)B_1(x)\,dx. \tag{5.164}$$

Again, using Eq. 5.160, we have

$$B_1(x) = \frac{1}{2}B_2'(x), \tag{5.165}$$

and integrating by parts

$$\int_0^1 f(x)\,dx = \frac{1}{2}[f(1) + f(0)] - \frac{1}{2!}[f'(1)B_2(1) - f'(0)B_2(0)]
+ \frac{1}{2!}\int_0^1 f^{(2)}(x)B_2(x)\,dx. \tag{5.166}$$


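Using the relations,

$$\begin{aligned}
B_{2n}(1) &= B_{2n}(0) = B_{2n}, & n &= 0, 1, 2, \ldots\\
B_{2n+1}(1) &= B_{2n+1}(0) = 0, & n &= 1, 2, 3, \ldots
\end{aligned} \tag{5.167}$$

and continuing this process, we have

$$\int_0^1 f(x)\,dx = \frac{1}{2}[f(1) + f(0)] - \sum_{p=1}^{q}\frac{1}{(2p)!}B_{2p}\left[f^{(2p-1)}(1) - f^{(2p-1)}(0)\right]
+ \frac{1}{(2q)!}\int_0^1 f^{(2q)}(x)B_{2q}(x)\,dx. \tag{5.168a}$$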


This is the Euler-Maclaurin integration formula. It assumes that the function 
f(x) has the required derivatives. 

The range of integration in Eq. 5.168a may be shifted from [0, 1] to [1, 2]
by replacing $f(x)$ by $f(x + 1)$. Adding such results up to $[n - 1, n]$,

$$\int_0^n f(x)\,dx = \frac{1}{2}f(0) + f(1) + f(2) + \cdots + f(n - 1) + \frac{1}{2}f(n)
- \sum_{p=1}^{q}\frac{1}{(2p)!}B_{2p}\left[f^{(2p-1)}(n) - f^{(2p-1)}(0)\right] + \text{remainder term}. \tag{5.168b}$$


The terms $\frac{1}{2}f(0) + f(1) + \cdots + \frac{1}{2}f(n)$ appear exactly as in trapezoidal
integration or quadrature. The summation over $p$ may be interpreted as a
correction to the trapezoidal approximation. Equation 5.168b is the form used
in Exercise 5.9.5 for summing positive powers of integers and in Section 10.3
for the derivation of Stirling's formula.
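
To illustrate Eq. 5.168b, the sketch below sums $m^k$ for $m = 1, \ldots, n$ by taking $f(x) = x^k$. Because $f^{(2q)}$ vanishes once $2q > k$, the remainder term drops out and the Euler-Maclaurin result is exact. The function names are hypothetical, and the Bernoulli numbers are regenerated from the recursion implied by Eq. 5.147.

```python
from fractions import Fraction
from math import comb, factorial


def bernoulli_numbers(n_max):
    """[B_0, ..., B_n_max] with B_1 = -1/2."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        B.append(-sum(comb(n + 1, j) * B[j] for j in range(n)) / (n + 1))
    return B


def power_sum(k, n):
    """sum_{m=1}^{n} m**k (k >= 1) from Eq. 5.168b with f(x) = x**k."""
    B = bernoulli_numbers(k)
    # Integral of x**k over [0, n], plus f(n)/2; f(0) = 0 for k >= 1.
    total = Fraction(n ** (k + 1), k + 1) + Fraction(n ** k, 2)
    for p in range(1, k // 2 + 1):
        order = 2 * p - 1
        # f^(order)(x) = k!/(k - order)! * x**(k - order); it vanishes at x = 0
        # because k - order >= 1 for every p in this range.
        derivative_at_n = Fraction(factorial(k), factorial(k - order)) * n ** (k - order)
        total += B[2 * p] / factorial(2 * p) * derivative_at_n
    return total


if __name__ == "__main__":
    print(power_sum(4, 10), sum(m ** 4 for m in range(1, 11)))
```

For $k = 4$ this reproduces $n^5/5 + n^4/2 + n^3/3 - n/30$, the closed form requested in Exercise 5.9.5(d).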
TABLE 5.3 Riemann Zeta Function

  s     ζ(s)
  2     1.64493 40668
  3     1.20205 69032
  4     1.08232 32337
  5     1.03692 77551
  6     1.01734 30620
  7     1.00834 92774
  8     1.00407 73562
  9     1.00200 83928
 10     1.00099 45751


The Euler-Maclaurin formula is often useful in summing series by converting
them to integrals.³

Riemann Zeta Function 

This series $\sum_{p=1}^{\infty} p^{-2n}$ was used as a comparison series for testing convergence
(Section 5.2) and in Eq. 5.151 as one definition of the Bernoulli numbers, $B_{2n}$.
It also serves to define the Riemann zeta function by

$$\zeta(s) = \sum_{n=1}^{\infty} n^{-s}, \qquad s > 1. \tag{5.169}$$

Table 5.3 lists the values of $\zeta(s)$ for integral $s$, $s = 2, 3, \ldots, 10$. Closed forms for
even $s$ appear in Exercise 5.9.6. Figure 5.10 is a plot of $\zeta(s) - 1$. An integral
expression for this Riemann zeta function appears in Section 10.2 as part of
the development of the gamma function.

Another interesting expression for the Riemann zeta function may be derived 
as follows: 

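$$\zeta(s)(1 - 2^{-s}) = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \cdots - \left(\frac{1}{2^s} + \frac{1}{4^s} + \frac{1}{6^s} + \cdots\right), \tag{5.170}$$

eliminating all the $n^{-s}$, where $n$ is a multiple of 2. Then

$$\zeta(s)(1 - 2^{-s})(1 - 3^{-s}) = 1 + \frac{1}{3^s} + \frac{1}{5^s} + \frac{1}{7^s} + \frac{1}{9^s} + \cdots
- \left(\frac{1}{3^s} + \frac{1}{9^s} + \frac{1}{15^s} + \cdots\right), \tag{5.171}$$

eliminating all the remaining terms in which $n$ is a multiple of 3. Continuing,
we have $\zeta(s)(1 - 2^{-s})(1 - 3^{-s})(1 - 5^{-s})\cdots(1 - P^{-s})$, where $P$ is a prime
number, and all terms $n^{-s}$, in which $n$ is a multiple of any integer up through
$P$, are canceled out. As $P \to \infty$,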


3 Compare Boas, R. P., and C. Stutz, “Estimating Sums with Integrals,” 

Am. J. Phys. 39, 745 (1971) for a number of examples. 
$$\zeta(s)(1 - 2^{-s})(1 - 3^{-s})\cdots(1 - P^{-s}) \to \zeta(s)\prod_{P(\text{prime})=2}^{\infty}(1 - P^{-s}) = 1. \tag{5.172}$$

Therefore

$$\zeta(s) = \left[\prod_{P(\text{prime})=2}^{\infty}(1 - P^{-s})\right]^{-1}, \tag{5.173}$$

giving $\zeta(s)$ as an infinite product.⁴

This cancellation procedure has a clear application in numerical computation.
Equation 5.170 will give $\zeta(s)(1 - 2^{-s})$ to the same accuracy as Eq. 5.169
gives $\zeta(s)$, but with only half as many terms. (In either case, a correction would
be made for the neglected tail of the series by the Maclaurin integral test
technique, replacing the series by an integral; see Section 5.2.)
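
The numerical point made above can be seen in a short experiment. The sketch below evaluates $\zeta(s)$ three ways: by the defining series (Eq. 5.169), by the odd-term series of Eq. 5.170, and by a truncated Euler product (Eq. 5.173). All function names are hypothetical, and no tail correction is applied, so the comparison shows raw truncation error only.

```python
import math


def zeta_direct(s, n_terms):
    """Partial sum of Eq. 5.169 with n_terms terms."""
    return sum(n ** -s for n in range(1, n_terms + 1))


def zeta_odd_terms(s, n_terms):
    """Eq. 5.170: zeta(s) (1 - 2**-s) equals the sum over odd n of n**-s."""
    partial = sum((2 * k + 1) ** -s for k in range(n_terms))
    return partial / (1.0 - 2.0 ** -s)


def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(range(i * i, limit + 1, i))
    return [i for i, flag in enumerate(sieve) if flag]


def zeta_euler_product(s, p_max):
    """Eq. 5.173 truncated at primes <= p_max."""
    product = 1.0
    for p in primes_up_to(p_max):
        product /= 1.0 - p ** -s
    return product


if __name__ == "__main__":
    exact = math.pi ** 4 / 90  # zeta(4), Exercise 5.9.6
    print(abs(zeta_direct(4, 50) - exact))
    print(abs(zeta_odd_terms(4, 50) - exact))
    print(abs(zeta_euler_product(4, 100) - exact))
```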

Along with the Riemann zeta function, AMS-55 (Chapter 23) defines three 
other functions of sums of reciprocal powers: 


$$\eta(s) = \sum_{n=1}^{\infty}(-1)^{n-1}n^{-s} = (1 - 2^{1-s})\zeta(s), \qquad s = 1, 2, \ldots,$$


4 This is the starting point for the extensive applications of the Riemann zeta
function to number theory. See Edwards, H. M., Riemann's Zeta Function.
New York: Academic Press (1974).
$$\lambda(s) = \sum_{n=0}^{\infty}(2n + 1)^{-s} = (1 - 2^{-s})\zeta(s), \qquad s = 2, 3, \ldots,$$

and

$$\beta(s) = \sum_{n=0}^{\infty}(-1)^n(2n + 1)^{-s}, \qquad s = 1, 2, \ldots.$$


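From the Bernoulli numbers (Exercise 5.9.6) or Fourier series (Example 14.3.3
and Exercise 14.3.13) special values are

$$\begin{aligned}
\zeta(2) &= 1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots = \frac{\pi^2}{6},\\
\zeta(4) &= 1 + \frac{1}{2^4} + \frac{1}{3^4} + \cdots = \frac{\pi^4}{90},\\
\eta(2) &= 1 - \frac{1}{2^2} + \frac{1}{3^2} - \cdots = \frac{\pi^2}{12},\\
\eta(4) &= 1 - \frac{1}{2^4} + \frac{1}{3^4} - \cdots = \frac{7\pi^4}{720},\\
\lambda(2) &= 1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots = \frac{\pi^2}{8},\\
\lambda(4) &= 1 + \frac{1}{3^4} + \frac{1}{5^4} + \cdots = \frac{\pi^4}{96},\\
\beta(1) &= 1 - \frac{1}{3} + \frac{1}{5} - \cdots = \frac{\pi}{4},\\
\beta(3) &= 1 - \frac{1}{3^3} + \frac{1}{5^3} - \cdots = \frac{\pi^3}{32}.
\end{aligned}$$

Catalan's constant,

$$\beta(2) = 1 - \frac{1}{3^2} + \frac{1}{5^2} - \cdots = 0.9159\,6559\ldots,$$

is the topic of Exercise 5.2.22.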


Improvement of Convergence 

If we are required to sum a convergent series $\sum_{n=1}^{\infty} a_n$ whose terms are rational
functions of $n$, the convergence may be improved dramatically by introducing
the Riemann zeta function.

EXAMPLE 5.9.1 Improvement of convergence 

The problem is to evaluate the series $\sum_{n=1}^{\infty} 1/(1 + n^2)$. Expanding
$(1 + n^2)^{-1} = n^{-2}(1 + n^{-2})^{-1}$ by direct division, we have

$$(1 + n^2)^{-1} = n^{-2}\left(1 - n^{-2} + n^{-4} - \frac{n^{-6}}{1 + n^{-2}}\right)
= \frac{1}{n^2} - \frac{1}{n^4} + \frac{1}{n^6} - \frac{1}{n^8 + n^6}.$$

Therefore

$$\sum_{n=1}^{\infty}\frac{1}{1 + n^2} = \zeta(2) - \zeta(4) + \zeta(6) - \sum_{n=1}^{\infty}\frac{1}{n^8 + n^6}.$$

The $\zeta$ functions are tabulated and the remainder series converges as $n^{-8}$. Clearly,
the process can be continued as desired. You make a choice between how much
algebra you will do and how much arithmetic the computing machine will do.

Other methods for improving computational effectiveness are given at the 
end of Sections 5.2 and 5.4. 
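
A short numerical comparison shows how much is gained in Example 5.9.1. The sketch below sums $1/(1 + n^2)$ directly and in the accelerated form $\zeta(2) - \zeta(4) + \zeta(6) - \sum 1/(n^8 + n^6)$, using the closed forms of Exercise 5.9.6 for the zeta values. The reference value $(\pi\coth\pi - 1)/2$ is a standard closed form that is not derived in this section, and the function names are hypothetical.

```python
import math


def direct_sum(n_terms):
    """Partial sum of 1/(1 + n**2), n = 1..n_terms."""
    return sum(1.0 / (1 + n * n) for n in range(1, n_terms + 1))


def accelerated_sum(n_terms):
    """zeta(2) - zeta(4) + zeta(6) minus a partial sum of the n**-8 remainder."""
    zeta2 = math.pi ** 2 / 6
    zeta4 = math.pi ** 4 / 90
    zeta6 = math.pi ** 6 / 945
    remainder = sum(1.0 / (n ** 8 + n ** 6) for n in range(1, n_terms + 1))
    return zeta2 - zeta4 + zeta6 - remainder


if __name__ == "__main__":
    reference = (math.pi / math.tanh(math.pi) - 1.0) / 2.0
    for n_terms in (10, 100):
        print(n_terms,
              abs(direct_sum(n_terms) - reference),
              abs(accelerated_sum(n_terms) - reference))
```

With only ten terms the accelerated form is already accurate to about eight decimal places, whereas the direct sum is still off in the first decimal place.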



EXERCISES 


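5.9.1 Show that

$$\tan x = \sum_{n=1}^{\infty}\frac{(-1)^{n-1}2^{2n}(2^{2n} - 1)B_{2n}}{(2n)!}x^{2n-1}, \qquad -\frac{\pi}{2} < x < \frac{\pi}{2}.$$

Hint. $\tan x = \cot x - 2\cot 2x$.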


5.9.2 The Bernoulli numbers generated in Eq. 5.144 may be generalized to Bernoulli
polynomials,

$$\frac{x e^{xs}}{e^x - 1} = \sum_{n=0}^{\infty} B_n(s)\frac{x^n}{n!}.$$

Show that

$$B_0(s) = 1, \qquad B_1(s) = s - \frac{1}{2}, \qquad B_2(s) = s^2 - s + \frac{1}{6}.$$

Note that $B_n(0) = B_n$, the Bernoulli number.

5.9.3 Show that $B_n'(s) = nB_{n-1}(s)$, $n = 1, 2, 3, \ldots$.

Hint. Differentiate the equation in Exercise 5.9.2.

5.9.4 Show that

$$B_n(1) = (-1)^n B_n(0).$$

Hint. Go back to the generating function, Eq. 5.158 or Exercise 5.9.2.

5.9.5 The Euler-Maclaurin integration formula may be used for the evaluation of
finite series:

$$\sum_{m=1}^{n} f(m) = \int_1^n f(x)\,dx + \frac{1}{2}f(1) + \frac{1}{2}f(n) + \frac{1}{12}[f'(n) - f'(1)] + \cdots.$$

Show that

(a) $\sum_{m=1}^{n} m = \frac{1}{2}n(n + 1)$.

(b) $\sum_{m=1}^{n} m^2 = \frac{1}{6}n(n + 1)(2n + 1)$.

(c) $\sum_{m=1}^{n} m^3 = \frac{1}{4}n^2(n + 1)^2$.

(d) $\sum_{m=1}^{n} m^4 = \frac{1}{30}n(n + 1)(2n + 1)(3n^2 + 3n - 1)$.

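5.9.6 From

$$B_{2n} = \frac{(-1)^{n-1}\,2(2n)!}{(2\pi)^{2n}}\zeta(2n),$$

show that

(a) $\zeta(2) = \dfrac{\pi^2}{6}$

(b) $\zeta(4) = \dfrac{\pi^4}{90}$

(c) $\zeta(6) = \dfrac{\pi^6}{945}$

(d) $\zeta(8) = \dfrac{\pi^8}{9450}$

(e) $\zeta(10) = \dfrac{\pi^{10}}{93{,}555}$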



5.9.7 Planck's black-body radiation law involves the integral

$$\int_0^{\infty}\frac{x^3\,dx}{e^x - 1}.$$

Show that this equals $6\zeta(4)$. From Exercise 5.9.6

$$\zeta(4) = \frac{\pi^4}{90}.$$

Hint. Make use of the gamma function, Chapter 10.


5.9.8 Prove that

$$\int_0^{\infty}\frac{x^n e^x\,dx}{(e^x - 1)^2} = n!\,\zeta(n).$$


Assuming n to be real, show that each side of the equation diverges if n = 1. 
Hence the preceding equation carries the condition n > 1. Integrals such as this 
appear in the quantum theory of transport effects — thermal and electrical 
conductivity. 


5.9.9 The Bloch-Grüneisen approximation for the resistance in a monovalent metal
is

$$\rho = C\frac{T^5}{\Theta^6}\int_0^{\Theta/T}\frac{x^5\,dx}{(e^x - 1)(1 - e^{-x})},$$

where $\Theta$ is the Debye temperature characteristic of the metal.

(a) For $T \to \infty$ show that

$$\rho \approx \frac{C}{4}\cdot\frac{T}{\Theta^2}.$$

(b) For $T \to 0$, show that

$$\rho \approx 5!\,\zeta(5)\,C\frac{T^5}{\Theta^6}.$$

5.9.10 Show that

(a) $$\int_0^1\frac{\ln(1 + x)}{x}\,dx = \frac{1}{2}\zeta(2),$$

(b) $$\lim_{a\to 1}\int_0^a\frac{\ln(1 - x)}{x}\,dx = -\zeta(2).$$

From Exercise 5.9.6, $\zeta(2) = \pi^2/6$. Note that the integrand in part (b) diverges for
$a = 1$ but that the integrated series is convergent.


5.9.11 The integral

$$\int_0^1 [\ln(1 - x)]^2\,\frac{dx}{x}$$

appears in the fourth-order correction to the magnetic moment of the electron.
Show that it equals $2\zeta(3)$.

Hint. Let $1 - x = e^{-t}$.


5.9.12 Show that

$$\int_0^{\infty}\frac{(\ln z)^2}{1 + z^2}\,dz = 4\left(1 - \frac{1}{3^3} + \frac{1}{5^3} - \frac{1}{7^3} + \cdots\right).$$

By contour integration (Exercise 7.2.17), this may be shown equal to $\pi^3/8$.

5.9.13 For "small" values of $x$

$$\ln(x!) = -\gamma x + \sum_{n=2}^{\infty}(-1)^n\frac{\zeta(n)}{n}x^n,$$

where $\gamma$ is the Euler-Mascheroni constant and $\zeta(n)$ the Riemann zeta function.
For what values of $x$ does this series converge?

ANS. $-1 < x \le 1$.

Note that if $x = 1$, we obtain

$$\gamma = \sum_{n=2}^{\infty}(-1)^n\frac{\zeta(n)}{n},$$

a series for the Euler-Mascheroni constant. The convergence of this series is
exceedingly slow. For actual computation of $\gamma$, other, indirect approaches are
far superior (see Exercises 5.9.17, 5.10.11, and 10.5.16).


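5.9.14 Show that the series expansion of $\ln(x!)$ (Exercise 5.9.13) may be written as

(a) $$\ln(x!) = \frac{1}{2}\ln\left(\frac{\pi x}{\sin\pi x}\right) - \gamma x - \sum_{n=1}^{\infty}\frac{\zeta(2n + 1)}{2n + 1}x^{2n+1},$$

(b) $$\ln(x!) = \frac{1}{2}\ln\left(\frac{\pi x}{\sin\pi x}\right) - \frac{1}{2}\ln\left(\frac{1 + x}{1 - x}\right) + (1 - \gamma)x - \sum_{n=1}^{\infty}[\zeta(2n + 1) - 1]\frac{x^{2n+1}}{2n + 1}.$$

Determine the range of convergence of each of these expressions.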
5.9.15 Show that Catalan's constant, $\beta(2)$, may be written as

$$\beta(2) = 2\sum_{k=1}^{\infty}(4k - 3)^{-2} - \frac{\pi^2}{8}.$$

Hint. $\pi^2 = 6\zeta(2)$.

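5.9.16 Derive the following expansions of the Debye functions

(a) $$\int_0^x\frac{t^n\,dt}{e^t - 1} = x^n\left[\frac{1}{n} - \frac{x}{2(n + 1)} + \sum_{k=1}^{\infty}\frac{B_{2k}x^{2k}}{(2k + n)(2k)!}\right], \qquad |x| < 2\pi,\ n \ge 1,$$

(b) $$\int_x^{\infty}\frac{t^n\,dt}{e^t - 1} = \sum_{k=1}^{\infty}e^{-kx}\left[\frac{x^n}{k} + \frac{nx^{n-1}}{k^2} + \frac{n(n - 1)x^{n-2}}{k^3} + \cdots + \frac{n!}{k^{n+1}}\right], \qquad x > 0,\ n \ge 1.$$

The complete integral $(0, \infty)$ equals $n!\,\zeta(n + 1)$, Exercise 10.2.15.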

5.9.17 Derive the following Bernoulli number series for the Euler-Mascheroni constant.

$$\gamma = \sum_{s=1}^{n}s^{-1} - \ln n - \frac{1}{2n} + \sum_{k=1}^{N}\frac{B_{2k}}{(2k)n^{2k}}.$$

Hint. Apply the Euler-Maclaurin integration formula to $f(x) = x^{-1}$ over the
range $[n, N]$.

5.9.18 (a) Show that the equation $\ln 2 = \sum_{s=1}^{\infty}(-1)^{s+1}s^{-1}$ (Exercise 5.4.1) may be
rewritten as

$$\ln 2 = \sum_{s=2}^{n}2^{-s}\zeta(s) + \sum_{p=1}^{\infty}(2p)^{-n-1}\left[1 - \frac{1}{2p}\right]^{-1}.$$

Hint. Take the terms in pairs.

(b) Calculate $\ln 2$ to six significant figures.

5.9.19 (a) Show that the equation $\pi/4 = \sum_{n=1}^{\infty}(-1)^{n+1}(2n - 1)^{-1}$ (Exercise 5.7.6) may
be rewritten as

$$\frac{\pi}{4} = 1 - 2\sum_{s=1}^{n}4^{-2s}\zeta(2s) - 2\sum_{p=1}^{\infty}(4p)^{-2n-2}\left[1 - \frac{1}{(4p)^2}\right]^{-1}.$$

(b) Calculate $\pi/4$ to six significant figures.


5.9.20 Write a function subprogram ZETA(N) that will calculate the Riemann zeta
function for integer argument. Tabulate $\zeta(s)$ for $s = 2, 3, 4, \ldots, 20$. Check your
values against Table 5.3 and AMS-55, Chapter 23.

Hint. If you simply supply the function subprogram with the values of $\zeta(2)$, $\zeta(3)$,
and $\zeta(4)$, you avoid the more slowly converging series. Calculation time may be
further shortened by using Eq. 5.170.

5.9.21 Calculate the logarithm (base 10) of $|B_{2n}|$, $n = 10, 20, \ldots, 100$.

Hint. Program the zeta function as a function subprogram, Exercise 5.9.20.

Check values. $\log|B_{100}| = 78.45$
              $\log|B_{200}| = 215.56$.
5.10 ASYMPTOTIC OR SEMICONVERGENT SERIES


Asymptotic series frequently occur in physics. In numerical computations
they are employed for the accurate computation of a variety of functions. We
consider here two types of integrals that lead to asymptotic series: first, an
integral of the form

$$I_1(x) = \int_x^{\infty}e^{-u}f(u)\,du,$$

where the variable $x$ appears as the lower limit of an integral. Second, we consider
the form

$$I_2(x) = \int_0^{\infty}e^{-u}f\!\left(\frac{u}{x}\right)du,$$

with the function $f$ to be expanded as a Taylor series (binomial series).
Asymptotic series often occur as solutions of differential equations. An example of
this appears in Section 11.6 as a solution of Bessel's equation.


Incomplete Gamma Function 

The nature of an asymptotic series is perhaps best illustrated by a specific
example. Suppose that we have the exponential integral function¹

$$\operatorname{Ei}(x) = \int_{-\infty}^{x}\frac{e^u}{u}\,du, \tag{5.174}$$

or

$$-\operatorname{Ei}(-x) = \int_x^{\infty}\frac{e^{-u}}{u}\,du = E_1(x), \tag{5.175}$$

to be evaluated for large values of $x$. Better still, let us take a generalization of
the incomplete factorial function (incomplete gamma function),²

$$I(x, p) = \int_x^{\infty}e^{-u}u^{-p}\,du = \Gamma(1 - p, x), \tag{5.176}$$

in which $x$ and $p$ are positive. Again, we seek to evaluate it for large values of $x$.
Integrating by parts, we obtain

$$I(x, p) = \frac{e^{-x}}{x^p} - p\int_x^{\infty}e^{-u}u^{-p-1}\,du
= \frac{e^{-x}}{x^p} - \frac{pe^{-x}}{x^{p+1}} + p(p + 1)\int_x^{\infty}e^{-u}u^{-p-2}\,du. \tag{5.177}$$


1 This function occurs frequently in astrophysical problems involving gas with
a Maxwell-Boltzmann energy distribution.

2 See also Section 10.5. 
Continuing to integrate by parts, we develop the series

$$I(x, p) = e^{-x}\left[\frac{1}{x^p} - \frac{p}{x^{p+1}} + \frac{p(p + 1)}{x^{p+2}} - \cdots + (-1)^{n-1}\frac{(p + n - 2)!}{(p - 1)!\,x^{p+n-1}}\right]
+ (-1)^n\frac{(p + n - 1)!}{(p - 1)!}\int_x^{\infty}e^{-u}u^{-p-n}\,du. \tag{5.178}$$

This is a remarkable series. Checking the convergence by the d'Alembert
ratio test, we find

$$\lim_{n\to\infty}\frac{|u_{n+1}|}{|u_n|} = \lim_{n\to\infty}\frac{(p + n)!}{(p + n - 1)!}\cdot\frac{1}{x} = \lim_{n\to\infty}\frac{p + n}{x} = \infty \tag{5.179}$$

for all finite values of $x$. Therefore our series as an infinite series diverges
everywhere! Before discarding Eq. 5.178 as worthless, let us see how well a given
partial sum approximates the incomplete factorial function, $I(x, p)$.

$$I(x, p) - s_n(x, p) = (-1)^{n+1}\frac{(p + n)!}{(p - 1)!}\int_x^{\infty}e^{-u}u^{-p-n-1}\,du = R_n(x, p). \tag{5.180}$$

In absolute value

$$|I(x, p) - s_n(x, p)| \le \frac{(p + n)!}{(p - 1)!}\int_x^{\infty}e^{-u}u^{-p-n-1}\,du.$$

When we substitute $u = v + x$ the integral becomes

$$\int_x^{\infty}e^{-u}u^{-p-n-1}\,du = e^{-x}\int_0^{\infty}e^{-v}(v + x)^{-p-n-1}\,dv
= \frac{e^{-x}}{x^{p+n+1}}\int_0^{\infty}e^{-v}\left(1 + \frac{v}{x}\right)^{-p-n-1}dv.$$

For large $x$ the final integral approaches 1 and

$$|I(x, p) - s_n(x, p)| \approx \frac{(p + n)!}{(p - 1)!}\cdot\frac{e^{-x}}{x^{p+n+1}}. \tag{5.181}$$

This means that if we take $x$ large enough, our partial sum $s_n$ is an arbitrarily
good approximation to the desired function $I(x, p)$. Our divergent series (Eq.
5.178) therefore is perfectly good for computations. For this reason it is
sometimes called a semiconvergent series. Note that the power of $x$ in the denominator
of the remainder $(p + n + 1)$ is higher than the power of $x$ in the last term
included in $s_n(x, p)$, $(p + n)$.

Since the remainder $R_n(x, p)$ alternates in sign, the successive partial sums give
alternately upper and lower bounds for $I(x, p)$. The behavior of the series (with
$p = 1$) as a function of the number of terms included is shown in Fig. 5.11. We
have

$$e^xE_1(x) = e^x\int_x^{\infty}\frac{e^{-u}}{u}\,du \approx s_n(x) = \frac{1}{x} - \frac{1!}{x^2} + \frac{2!}{x^3} - \frac{3!}{x^4} + \cdots + (-1)^n\frac{n!}{x^{n+1}}, \tag{5.182}$$

which is evaluated at $x = 5$. For a given value of $x$ the successive upper and lower
bounds given by the partial sums first converge and then diverge. The optimum
determination of $e^xE_1(x)$ is then given by the closest approach of the upper and
lower bounds, that is, between $s_4 = s_6 = 0.1664$ and $s_5 = 0.1741$ for $x = 5$.
Therefore

$$0.1664 \le e^xE_1(x)\big|_{x=5} \le 0.1741. \tag{5.183}$$

Actually, from tables,

$$e^xE_1(x)\big|_{x=5} = 0.1704, \tag{5.184}$$

within the limits established by our asymptotic expansion. Note carefully that 
inclusion of additional terms in the series expansion beyond the optimum point 
literally reduces the accuracy of the representation. 
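
The behavior described above is easy to reproduce. The sketch below accumulates the partial sums of Eq. 5.182, labeling them by the number of terms included so that the fourth, fifth, and sixth partial sums correspond to $s_4$, $s_5$, and $s_6$ in the text. The function name `partial_sums` is a hypothetical choice.

```python
def partial_sums(x, n_terms):
    """Partial sums of 1/x - 1!/x**2 + 2!/x**3 - ...; entry j holds the sum of j + 1 terms."""
    sums = []
    total = 0.0
    term = 1.0 / x  # k = 0 term: (-1)**k k! / x**(k + 1)
    for k in range(n_terms):
        total += term
        sums.append(total)
        term *= -(k + 1) / x  # ratio of successive terms
    return sums


if __name__ == "__main__":
    for count, s in enumerate(partial_sums(5.0, 12), start=1):
        print(count, round(s, 5))
```

At $x = 5$ the printed values first close in on the tabulated 0.1704 and then swing away with growing amplitude, which is the "throat" referred to in Exercise 5.10.9.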

As $x$ is increased, the spread between the lowest upper bound and the highest
lower bound will diminish. By taking $x$ large enough, one may compute $e^xE_1(x)$
to any desired degree of accuracy. Other properties of $E_1(x)$ are derived and
discussed in Section 10.5.
Cosine and Sine Integrals

Asymptotic series may also be developed from definite integrals, provided the
integrand has the required behavior. As an example, the cosine and sine integrals
(Section 10.5) are defined by

$$\operatorname{Ci}(x) = -\int_x^{\infty}\frac{\cos t}{t}\,dt, \tag{5.185}$$

$$\operatorname{si}(x) = -\int_x^{\infty}\frac{\sin t}{t}\,dt. \tag{5.186}$$

Combining these with regular trigonometric functions, we may define

$$\begin{aligned}
f(x) &= \operatorname{Ci}(x)\sin x - \operatorname{si}(x)\cos x = \int_0^{\infty}\frac{\sin y}{y + x}\,dy,\\
g(x) &= -\operatorname{Ci}(x)\cos x - \operatorname{si}(x)\sin x = \int_0^{\infty}\frac{\cos y}{y + x}\,dy,
\end{aligned} \tag{5.187}$$

with the new variable $y = t - x$. Going to complex variables, Section 6.1,
we have

$$g(x) + if(x) = \int_0^{\infty}\frac{e^{iy}}{y + x}\,dy = \int_0^{\infty}\frac{ie^{-xu}}{1 + iu}\,du, \tag{5.188}$$

in which $u = -iy/x$. The limits of integration, 0 to $\infty$, rather than 0 to $-i\infty$,
may be justified by Cauchy's theorem, Section 6.3. Rationalizing the
denominator and equating real part to real part and imaginary part to imaginary
part, we obtain

$$\begin{aligned}
g(x) &= \int_0^{\infty}\frac{ue^{-xu}}{1 + u^2}\,du,\\
f(x) &= \int_0^{\infty}\frac{e^{-xu}}{1 + u^2}\,du.
\end{aligned} \tag{5.189}$$

For convergence of the integrals we must require that $\Re(x) > 0$.³

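Now, to develop the asymptotic expansions, let $v = xu$ and expand the factor
$[1 + (v/x)^2]^{-1}$ by the binomial theorem.⁴ We have

$$\begin{aligned}
f(x) &\approx \frac{1}{x}\int_0^{\infty}e^{-v}\sum_{n=0}^{\infty}(-1)^n\frac{v^{2n}}{x^{2n}}\,dv = \frac{1}{x}\sum_{n=0}^{\infty}(-1)^n\frac{(2n)!}{x^{2n}},\\
g(x) &\approx \frac{1}{x^2}\int_0^{\infty}e^{-v}\sum_{n=0}^{\infty}(-1)^n\frac{v^{2n+1}}{x^{2n}}\,dv = \frac{1}{x^2}\sum_{n=0}^{\infty}(-1)^n\frac{(2n + 1)!}{x^{2n}}.
\end{aligned} \tag{5.190}$$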

3 $\Re(x)$ = real part of (complex) $x$ (compare Section 6.1).

4 This step is valid for v < x. The contributions from v > x will be negligible 
(for large x) because of the negative exponential. It is because the binomial 
expansion does not converge for v > x that our final series is asymptotic rather 
than convergent. 
From Eqs. 5.187 and 5.190

$$\begin{aligned}
\operatorname{Ci}(x) &\approx \frac{\sin x}{x}\sum_{n=0}^{N}(-1)^n\frac{(2n)!}{x^{2n}} - \frac{\cos x}{x^2}\sum_{n=0}^{N}(-1)^n\frac{(2n + 1)!}{x^{2n}},\\
\operatorname{si}(x) &\approx -\frac{\cos x}{x}\sum_{n=0}^{N}(-1)^n\frac{(2n)!}{x^{2n}} - \frac{\sin x}{x^2}\sum_{n=0}^{N}(-1)^n\frac{(2n + 1)!}{x^{2n}},
\end{aligned} \tag{5.191}$$

the desired asymptotic expansions.

This technique of expanding the integrand of a definite integral and integrating
term by term is applied in Section 11.6 to develop an asymptotic expansion
of the modified Bessel function $K_\nu$ and in Section 13.6 for expansions of the
two confluent hypergeometric functions $M(a, c; x)$ and $U(a, c; x)$.


Definition of Asymptotic Series 

The behavior of these series (Eqs. 5.178 and 5.191) is consistent with the
defining properties of an asymptotic series.⁵ Following Poincaré, we take⁶

$$x^nR_n(x) = x^n[f(x) - s_n(x)], \tag{5.192}$$

where

$$s_n(x) = a_0 + \frac{a_1}{x} + \frac{a_2}{x^2} + \cdots + \frac{a_n}{x^n}. \tag{5.193}$$

The asymptotic expansion of $f(x)$ has the properties that

$$\lim_{x\to\infty}x^nR_n(x) = 0, \qquad \text{for fixed } n, \tag{5.194}$$

and

$$\lim_{n\to\infty}x^nR_n(x) = \infty, \qquad \text{for fixed } x.\ ^7 \tag{5.195}$$

For power series, as assumed in the form of $s_n(x)$, $R_n(x) \sim x^{-n-1}$. With conditions
(5.194) and (5.195) satisfied, we write

$$f(x) \approx \sum_{n=0}^{\infty}a_nx^{-n}. \tag{5.196}$$

Note the use of $\approx$ in place of $=$. The function $f(x)$ is equal to the series only
in the limit as $x \to \infty$.


5 It is not necessary that the asymptotic series be a power series. The required
property is that the remainder $R_n(x)$ be of higher order than the last term
kept, as in Eq. 5.194.

6 Poincaré's definition allows (or neglects) exponentially decreasing functions.
The refinement of Poincaré's definition is of considerable importance for the
advanced theory of asymptotic expansions, particularly for extensions into
the complex plane. However, for purposes of an introductory treatment and
especially for numerical computation with $x$ real and positive, Poincaré's
approach is perfectly satisfactory.

7 This excludes convergent series of inverse powers of x. Some writers feel that 
this distribution, this exclusion, is artificial and unnecessary. 
Asymptotic expansions of two functions may be multiplied together and the
result will be an asymptotic expansion of the product of the two functions.

The asymptotic expansion of a given function $f(t)$ may be integrated term by
term (just as in a uniformly convergent series of continuous functions) from
$x \le t < \infty$ and the result will be an asymptotic expansion of $\int_x^{\infty}f(t)\,dt$.
Term-by-term differentiation, however, is valid only under very special conditions.

Some functions do not possess an asymptotic expansion; $e^x$ is an example
of such a function. However, if a function has an asymptotic expansion, it has
only one. The correspondence is not one to one; many functions may have
the same asymptotic expansion.

One of the most useful and powerful methods of generating asymptotic 
expansions, the method of steepest descents, will be developed in Section 7.4. 
Applications include the derivation of Stirling’s formula for the (complete) 
factorial function (Section 10.3) and the asymptotic forms of the various Bessel 
functions (Section 11.6). Asymptotic series occur fairly often in mathematical 
physics. One of the earliest and still important approximation treatments of 
quantum mechanics, the WKB expansion, is an asymptotic series. 

Applications to Computing 

Asymptotic series are frequently used in the computations of functions by
modern high-speed electronic computers. This is the case for the Neumann
functions $N_0(x)$ and $N_1(x)$, and the modified Bessel functions $I_n(x)$ and $K_n(x)$.
The relevant asymptotic series are given as Eqs. 11.127, 11.134, and 11.136.
A further discussion of these functions is included in Section 11.6. The
asymptotic series for the exponential integral, Eq. 5.182, for the Fresnel integrals,
Exercise 5.10.2, and for the Gauss error function, Exercise 5.10.4, are used for
the evaluation of these integrals for large values of the argument. How large
the argument should be depends on what accuracy is required. In actual practice,
a finite portion of the asymptotic series is telescoped by using Chebyshev
techniques to optimize the accuracy as discussed in Section 13.4.


EXERCISES 


5.10.1 Stirling's formula for the logarithm of the factorial function is

$$\ln(x!) = \frac{1}{2}\ln 2\pi + \left(x + \frac{1}{2}\right)\ln x - x + \sum_{n=1}^{N}\frac{B_{2n}}{(2n)(2n - 1)x^{2n-1}}.$$

The $B_{2n}$ are the Bernoulli numbers (Section 5.9). Show that Stirling's formula is
an asymptotic expansion.


5.10.2 Integrating by parts, develop asymptotic expansions of the Fresnel integrals,

(a) $$C(x) = \int_0^x\cos\frac{\pi u^2}{2}\,du,$$

(b) $$S(x) = \int_0^x\sin\frac{\pi u^2}{2}\,du.$$

These integrals appear in the analysis of a knife-edge diffraction pattern.





5.10.3 Rederive the asymptotic expansions of $\operatorname{Ci}(x)$ and $\operatorname{si}(x)$ by repeated integration
by parts.

Hint. $\operatorname{Ci}(x) + i\operatorname{si}(x) = -\displaystyle\int_x^{\infty}\frac{e^{it}}{t}\,dt$.


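5.10.4 Derive the asymptotic expansion of the Gauss error function

$$\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x e^{-t^2}\,dt \approx 1 - \frac{e^{-x^2}}{\sqrt{\pi}\,x}\left(1 - \frac{1}{2x^2} + \frac{1\cdot 3}{2^2x^4} - \frac{1\cdot 3\cdot 5}{2^3x^6} + \cdots\right).$$

Hint. $\operatorname{erf}(x) = 1 - \operatorname{erfc}(x) = 1 - \dfrac{2}{\sqrt{\pi}}\displaystyle\int_x^{\infty}e^{-t^2}\,dt$.

Normalized so that $\operatorname{erf}(\infty) = 1$, this function plays an important role in
probability theory. It may be expressed in terms of the Fresnel integrals (Exercise 5.10.2),
the incomplete gamma functions (Section 10.5), and the confluent hypergeometric
functions (Section 13.6).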


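5.10.5 The asymptotic expressions for the various Bessel functions, Section 11.6,
contain the series

$$P_\nu(z) \sim 1 + \sum_{n=1}^{\infty}(-1)^n\frac{\prod_{s=1}^{2n}[4\nu^2 - (2s - 1)^2]}{(2n)!\,(8z)^{2n}},$$

$$Q_\nu(z) \sim \sum_{n=1}^{\infty}(-1)^{n+1}\frac{\prod_{s=1}^{2n-1}[4\nu^2 - (2s - 1)^2]}{(2n - 1)!\,(8z)^{2n-1}}.$$

Show that these two series are indeed asymptotic series.

5.10.6 For $x > 1$

$$\frac{1}{1 + x} = \sum_{n=0}^{\infty}(-1)^n\frac{1}{x^{n+1}}.$$

Test this series to see if it is an asymptotic series.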


5.10.7 In Exercise 5.9.17 the Euler-Mascheroni constant $\gamma$ is expressed with a Bernoulli
number series:

$$\gamma = \sum_{s=1}^{n}s^{-1} - \ln n - \frac{1}{2n} + \sum_{k=1}^{N}\frac{B_{2k}}{(2k)n^{2k}}.$$

Show that this is an asymptotic series.


5.10.8 Develop an asymptotic series for

$$\int_0^{\infty}e^{-xv}(1 + v^2)^{-2}\,dv.$$

Take $x$ to be real and positive.

ANS. $$\sum_{n=0}^{\infty}\frac{(-1)^n(n + 1)(2n)!}{x^{2n+1}}.$$


5.10.9 Calculate partial sums of $e^xE_1(x)$ for $x = 5$, 10, and 15 to exhibit the behavior
shown in Fig. 5.11. Determine the width of the throat for $x = 10$ and 15
analogous to Eq. 5.183.

ANS. Throat width: n = 10, 0.000051
                   n = 15, 0.0000002.
5.10.10 The knife-edge diffraction pattern is described by

$$I = 0.5\,I_0\left\{[C(u_0) + 0.5]^2 + [S(u_0) + 0.5]^2\right\},$$

where $C(u_0)$ and $S(u_0)$ are the Fresnel integrals. Here $I_0$ is the incident intensity
and $I$ the diffracted intensity. $u_0$ is proportional to the distance away from the
knife edge (measured at right angles to the incident beam). Calculate $I/I_0$ for
$u_0$ varying from $-1.0$ to $+4.0$ in steps of 0.1. Tabulate your results and, if a
plotting routine is available, plot them.

Check value. $u_0 = 1.0$, $I/I_0 = 1.259226$.


5.10.11 The Euler-Maclaurin integration formula of Section 5.9 provides a way of
calculating the Euler-Mascheroni constant $\gamma$ to high accuracy. Using $f(x) = 1/x$
in Eq. 5.168b (with interval $[1, n]$) and the definition of $\gamma$, Eq. 5.28, we obtain

$$\gamma = \sum_{s=1}^{n}s^{-1} - \ln n - \frac{1}{2n} + \sum_{k=1}^{N}\frac{B_{2k}}{(2k)n^{2k}}.$$

Using double precision arithmetic, calculate $\gamma$.

Note. Knuth, D. E., "Euler's constant to 1271 places," Math. Computation 16,
275 (1962). An even more precise calculation appears in Exercise 10.5.16.

ANS. For $n = 1000$,
$\gamma = 0.577215664901$.
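
One way to organize the calculation of Exercise 5.10.11 is sketched below. It evaluates the harmonic sum, subtracts $\ln n$ and $1/2n$, and adds the first few Bernoulli corrections, with $B_2$ through $B_{10}$ taken from Table 5.1. The function name `euler_gamma` and the default choices $n = 10$, $N = 5$ are illustrative assumptions; because the correction series is asymptotic, $N$ should not be increased indefinitely for fixed $n$.

```python
import math
from fractions import Fraction

# B_2, B_4, B_6, B_8, B_10 from Table 5.1
BERNOULLI_EVEN = [Fraction(1, 6), Fraction(-1, 30), Fraction(1, 42),
                  Fraction(-1, 30), Fraction(5, 66)]


def euler_gamma(n=10, n_corrections=5):
    """gamma ~ H_n - ln n - 1/(2n) + sum_{k=1}^{N} B_2k / (2k n**(2k))."""
    if not 0 <= n_corrections <= len(BERNOULLI_EVEN):
        raise ValueError("only B_2 .. B_10 are tabulated in this sketch")
    harmonic = sum(1.0 / s for s in range(1, n + 1))
    value = harmonic - math.log(n) - 1.0 / (2 * n)
    for k in range(1, n_corrections + 1):
        value += float(BERNOULLI_EVEN[k - 1]) / (2 * k * n ** (2 * k))
    return value


if __name__ == "__main__":
    print(euler_gamma(10, 5))
    print(euler_gamma(1000, 2))
```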


5.11 INFINITE PRODUCTS


Consider a succession of positive factors $f_1\cdot f_2\cdot f_3\cdot f_4\cdots f_n$ ($f_i > 0$). Using
capital pi to indicate product, as capital sigma indicates a sum, we have

$$f_1\cdot f_2\cdot f_3\cdots f_n = \prod_{i=1}^{n}f_i. \tag{5.197}$$

We define $p_n$, a partial product, in analogy with $s_n$ the partial sum,

$$p_n = \prod_{i=1}^{n}f_i \tag{5.198}$$

and then investigate the limit

$$\lim_{n\to\infty}p_n = P. \tag{5.199}$$

If $P$ is finite (but not zero), we say the infinite product is convergent. If $P$ is
infinite or zero, the infinite product is labeled divergent.

Since the product will diverge to infinity if

$$\lim_{n\to\infty}f_n > 1 \tag{5.200}$$

or to zero for

$$\lim_{n\to\infty}f_n < 1 \quad (\text{and} > 0),$$

it is convenient to write our infinite product as

$$\prod_{n=1}^{\infty}(1 + a_n). \tag{5.201}$$
The condition $a_n \to 0$ is then a necessary (but not sufficient) condition for
convergence.

The infinite product may be related to an infinite series by the obvious method
of taking the logarithm

$$\ln\prod_{n=1}^{\infty}(1 + a_n) = \sum_{n=1}^{\infty}\ln(1 + a_n). \tag{5.202}$$

A more useful relationship is stated by the following theorem.

Convergence of Infinite Product

If $0 \le a_n < 1$, the infinite products $\prod_{n=1}^{\infty}(1 + a_n)$ and $\prod_{n=1}^{\infty}(1 - a_n)$ converge
if $\sum_{n=1}^{\infty}a_n$ converges and diverge if $\sum_{n=1}^{\infty}a_n$ diverges.

Considering the term $1 + a_n$, we see from Eq. 5.90

$$1 + a_n \le e^{a_n}. \tag{5.203}$$

Therefore for the partial product $p_n$

$$p_n \le e^{s_n}, \tag{5.204}$$

and, letting $n \to \infty$,

$$\prod_{n=1}^{\infty}(1 + a_n) \le \exp\sum_{n=1}^{\infty}a_n, \tag{5.205}$$

thus establishing an upper bound for the infinite product.

To develop a lower bound, we note that 


n n n 


Pn = 

1 + £ a i + Z I! a i a i + ■ • • > 

i — 1 i = 1 j= 1 


(5.206) 

since a { > 0. Hence 

n (1 + a„) > £ a„. 


(5.207) 


n = 1 n — 1 


If the infinite sum remains finite, the infinite product will also. If the infinite 
sum diverges, so will the infinite product. 
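The two bounds can be checked numerically. The following is only a sketch: the
test sequences $a_i = 1/(i+1)^2$ and $a_i = 1/(i+1)$ and the helper names are
assumptions chosen for illustration, not part of the text. For each sequence the
partial product should lie between $s_n$ and $\exp(s_n)$.

```python
import math

def partial_sum_and_product(a, n):
    """Return (s_n, p_n): partial sum of a(i) and partial product of 1 + a(i)."""
    s, p = 0.0, 1.0
    for i in range(1, n + 1):
        s += a(i)
        p *= 1.0 + a(i)
    return s, p

def a_convergent(i):
    # Assumed test sequence: 0 < a_i < 1 and sum a_i converges.
    return 1.0 / (i + 1) ** 2

def a_divergent(i):
    # Assumed test sequence: 0 < a_i < 1 but sum a_i diverges.
    return 1.0 / (i + 1)

for n in (10, 100, 1000):
    for name, a in (("convergent", a_convergent), ("divergent", a_divergent)):
        s, p = partial_sum_and_product(a, n)
        # Bounds from the proof: s_n < p_n <= exp(s_n)
        print(f"{name:10s} n={n:5d} s_n={s:10.6f} p_n={p:12.6f} exp(s_n)={math.exp(s):12.6f}")
```

For the divergent choice the product telescopes, $p_n = (n+2)/2$, so it grows
without bound together with the harmonic-type sum, as the theorem requires.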

The case of $\prod (1 - a_n)$ is complicated by the negative signs, but a proof that
depends on the foregoing proof may be developed by noting that for $a_n < \tfrac{1}{2}$
(remember $a_n \to 0$ for convergence)

$$(1 - a_n) \le (1 + a_n)^{-1}$$

and

$$(1 - a_n) \ge (1 + 2a_n)^{-1}. \tag{5.208}$$

Sine, Cosine, and Gamma Functions

The reader will recognize that an nth-order polynomial $P_n(x)$ with $n$ real
roots may be written as a product of $n$ factors:

$$P_n(x) = (x - x_1)(x - x_2) \cdots (x - x_n) = \prod_{i=1}^{n} (x - x_i). \tag{5.209}$$

In much the same way we may expect that a function with an infinite number
of roots may be written as an infinite product, one factor for each root. This is
indeed the case for the trigonometric functions. We have two very useful infinite
product representations,

$$\sin x = x \prod_{n=1}^{\infty} \left( 1 - \frac{x^2}{n^2 \pi^2} \right), \tag{5.210}$$

$$\cos x = \prod_{n=1}^{\infty} \left[ 1 - \frac{4x^2}{(2n-1)^2 \pi^2} \right]. \tag{5.211}$$


The most convenient and perhaps most elegant derivation of these two expres-
sions is by the use of complex variables.¹ By our theorem of convergence,
Eqs. 5.210 and 5.211 are convergent for all finite values of $x$. Specifically, for
the infinite product for $\sin x$, $a_n = x^2/n^2\pi^2$,

$$\sum_{n=1}^{\infty} a_n = \frac{x^2}{\pi^2} \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{x^2}{\pi^2} \zeta(2) = \frac{x^2}{6} \tag{5.212}$$

by Exercise 5.9.6. The series corresponding to Eq. 5.211 behaves in a similar
manner.

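Equation 5.210 leads to two interesting results. First, if we set $x = \pi/2$,
we obtain

$$1 = \frac{\pi}{2} \prod_{n=1}^{\infty} \left[ 1 - \frac{1}{(2n)^2} \right] = \frac{\pi}{2} \prod_{n=1}^{\infty} \left[ \frac{(2n)^2 - 1}{(2n)^2} \right]. \tag{5.213}$$

Solving for $\pi/2$, we have

$$\frac{\pi}{2} = \prod_{n=1}^{\infty} \frac{(2n)^2}{(2n-1)(2n+1)} = \frac{2 \cdot 2}{1 \cdot 3} \cdot \frac{4 \cdot 4}{3 \cdot 5} \cdot \frac{6 \cdot 6}{5 \cdot 7} \cdots, \tag{5.214}$$

which is Wallis's famous formula for $\pi/2$.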
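As a small numerical illustration of Eq. 5.214, the sketch below multiplies the
first few Wallis factors; the function name `wallis_partial` is a label invented
here. Under the estimate that the neglected tail behaves like $\exp(1/4N)$, the
error after $N$ factors is expected to be roughly $\pi/8N$, so convergence is slow.

```python
import math

def wallis_partial(n_factors):
    """Partial product of Wallis's formula, Eq. 5.214, using n_factors factors."""
    p = 1.0
    for n in range(1, n_factors + 1):
        p *= (2.0 * n) ** 2 / ((2.0 * n - 1.0) * (2.0 * n + 1.0))
    return p

for n_factors in (10, 100, 1000):
    approx = wallis_partial(n_factors)
    # Compare with pi/2 and with the rough error estimate pi/(8 N).
    print(n_factors, approx, math.pi / 2 - approx, math.pi / (8 * n_factors))
```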

The second result involves the gamma or factorial function (Section 10.1).
One definition of the gamma function is

$$\Gamma(x) = \left[ x e^{\gamma x} \prod_{r=1}^{\infty} \left( 1 + \frac{x}{r} \right) e^{-x/r} \right]^{-1}, \tag{5.215}$$

where $\gamma$ is the usual Euler-Mascheroni constant (compare Section 5.2). If we
take the product of $\Gamma(x)$ and $\Gamma(-x)$, Eq. 5.215 leads to


¹ The derivation appears in Mathematical Methods for Physicists, 1st and
2nd eds. (Section 7.3). As an alternative, Eq. 5.210 can be obtained from the
Weierstrass factorization theorem.
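$$\Gamma(x)\Gamma(-x) = -\left[ x e^{\gamma x} \prod_{r=1}^{\infty} \left( 1 + \frac{x}{r} \right) e^{-x/r} \; x e^{-\gamma x} \prod_{r=1}^{\infty} \left( 1 - \frac{x}{r} \right) e^{x/r} \right]^{-1} = -\left[ x^2 \prod_{r=1}^{\infty} \left( 1 - \frac{x^2}{r^2} \right) \right]^{-1}. \tag{5.216}$$

Using Eq. 5.210 with $x$ replaced by $\pi x$, we obtain

$$\Gamma(x)\Gamma(-x) = -\frac{\pi}{x \sin \pi x}. \tag{5.217}$$

Anticipating a recurrence relation developed in Section 10.1, we have $-x\Gamma(-x) = \Gamma(1 - x)$.
Eq. 5.217 may be written as

$$\Gamma(x)\Gamma(1 - x) = \frac{\pi}{\sin \pi x}. \tag{5.218}$$

This will be useful in treating the gamma function (Chapter 10).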
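Equation 5.218 is easy to spot-check numerically. The sketch below relies on
Python's standard `math.gamma`; the function names and the sample points are
choices made here for illustration only.

```python
import math

def reflection_lhs(x):
    """Left-hand side of Eq. 5.218: Gamma(x) * Gamma(1 - x)."""
    return math.gamma(x) * math.gamma(1.0 - x)

def reflection_rhs(x):
    """Right-hand side of Eq. 5.218: pi / sin(pi x)."""
    return math.pi / math.sin(math.pi * x)

# Noninteger sample points only; both sides are singular at integer x.
for x in (0.5, 0.3, -1.5, 2.25):
    print(x, reflection_lhs(x), reflection_rhs(x))
```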

Strictly speaking, we should check the range of $x$ for which Eq. 5.215 is
convergent. Clearly, individual factors will vanish for $x = 0, -1, -2, \ldots$. The
proof that the infinite product converges for all other (finite) values of $x$ is left
as Exercise 5.11.9.

These infinite products have a variety of uses in analytical mathematics. 
However, because of rather slow convergence, they are not suitable for precise 
numerical work. 


EXERCISES 


5.11.1 Using

$$\ln \prod_{n=1}^{\infty} (1 \pm a_n) = \sum_{n=1}^{\infty} \ln(1 \pm a_n)$$

and the Maclaurin expansion of $\ln(1 \pm a_n)$, show that the infinite product
$\prod_{n=1}^{\infty} (1 \pm a_n)$ converges or diverges with the infinite series $\sum_{n=1}^{\infty} a_n$.

5.11.2 An infinite product appears in the form

$$\prod_{n=1}^{\infty} \left( \frac{1 + a/n}{1 + b/n} \right),$$

where $a$ and $b$ are constants. Show that this infinite product converges only if
$a = b$.

5.11.3 Show that the infinite product representations of $\sin x$ and $\cos x$ are consistent
with the identity $2 \sin x \cos x = \sin 2x$.


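5.11.4 Determine the limit to which

$$\prod_{n=2}^{\infty} \left( 1 + \frac{(-1)^n}{n} \right)$$

converges.

5.11.5 Show that

$$\prod_{n=2}^{\infty} \left[ 1 - \frac{2}{n(n+1)} \right] = \frac{1}{3}.$$

5.11.6 Prove that

$$\prod_{n=2}^{\infty} \left( 1 - \frac{1}{n^2} \right) = \frac{1}{2}.$$

5.11.7 Using the infinite product representations of $\sin x$, show that

$$x \cot x = 1 - 2 \sum_{m,n=1}^{\infty} \left( \frac{x}{n\pi} \right)^{2m},$$

hence that the Bernoulli number

$$B_{2n} = (-1)^{n-1} \frac{2(2n)!}{(2\pi)^{2n}} \zeta(2n).$$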



5.11.8 Verify the Euler identity

$$\prod_{p=1}^{\infty} (1 + z^p) = \prod_{q=1}^{\infty} (1 - z^{2q-1})^{-1}, \qquad |z| < 1.$$

5.11.9 Show that $\prod_{r=1}^{\infty} (1 + x/r) e^{-x/r}$ converges for all finite $x$ (except for the zeros of
$1 + x/r$).

Hint. Write the nth factor as $1 + a_n$.

5.11.10 Calculate $\cos x$ from its infinite product representation, Eq. 5.211, using (a) 10,
(b) 100, and (c) 1000 factors in the product. Calculate the absolute error. Note
how slowly the partial products converge, making the infinite product quite
unsuitable for precise numerical work.

ANS. For 1000 factors $\cos \pi = -1.00051$.
6 FUNCTIONS OF A
COMPLEX
VARIABLE I

ANALYTIC PROPERTIES
MAPPING


The imaginary numbers are a wonderful
flight of God's spirit; they are almost an
amphibian between being and not being.

Gottfried Wilhelm von Leibnitz, 1702

We turn now to a study of functions of a complex variable. In this area we 
develop some of the most powerful and widely useful tools in all of mathematical 
analysis. To indicate, at least partly, why complex variables are important, we 
mention briefly several areas of application. 

1. For many pairs of functions $u$ and $v$, both $u$ and $v$ satisfy Laplace's
equation

$$\nabla^2 \psi = \frac{\partial^2 \psi(x,y)}{\partial x^2} + \frac{\partial^2 \psi(x,y)}{\partial y^2} = 0.$$

Hence either $u$ or $v$ may be used to describe a two-dimensional electrostatic
potential. The other function that gives a family of curves orthogonal to those 
of the first function, may then be used to describe the electric field E. A similar 
situation holds for the hydrodynamics of an ideal fluid in irrotational motion. 
The function u might describe the velocity potential, whereas the function v 
would then be the stream function. 

In many cases in which the functions u and v are unknown, mapping or 
transforming in the complex plane permits us to create a coordinate system 
tailored to the particular problem. 

2. In Chapter 8 we shall see that the second-order differential equations of
interest in physics may be solved by power series. The same power series may
be used in the complex plane to replace $x$ by the complex variable $z$. The
dependence of the solution $f(z)$ at a given $z_0$ on the behavior of $f(z)$ elsewhere
gives us greater insight into the behavior of our solution and a powerful tool
(analytic continuation) for extending the region in which the solution is valid.

3. The change of a parameter $k$ from real to imaginary, $k \to ik$, transforms
the Helmholtz equation into the diffusion equation. The same change transforms
the Helmholtz equation solutions (Bessel and spherical Bessel functions) into
the diffusion equation solutions (modified Bessel and modified spherical Bessel
functions).

4. Integrals in the complex plane have a wide variety of useful applications. 

a. Evaluating definite integrals. 

b. Inverting power series. 

c. Forming infinite products. 

d. Obtaining solutions of differential equations for large 
values of the variable (asymptotic solutions). 

e. Investigating the stability of potentially oscillatory 
systems. 

f. Inverting integral transforms. 

5. Many physical quantities that were originally real become complex as 
a simple physical theory is made more general. The real index of refraction 
of light becomes a complex quantity when absorption is included. The real 
energy associated with a nuclear energy level becomes complex when the finite 
lifetime of the energy level is considered. 

6.1 COMPLEX ALGEBRA

A complex number is nothing more than an ordered pair of two ordinary
numbers, $(a,b)$ or $a + ib$, in which $i$ is $(-1)^{1/2}$. Similarly, a complex variable is
an ordered pair of two real variables,

$$z = (x,y) = x + iy. \tag{6.1}$$

The reader will see that the ordering is significant, that in general $a + ib$ is not
equal to $b + ia$ and $x + iy$ is not equal to $y + ix$.¹

It is frequently convenient to employ a graphical representation of the complex
variable. By plotting $x$ (the real part of $z$) as the abscissa and $y$ (the
imaginary part of $z$) as the ordinate, we have the complex plane or Argand
plane shown in Fig. 6.1. If we assign specific values to $x$ and $y$, then $z$ corresponds
to a point $(x,y)$ in the plane. In terms of the ordering mentioned before, it is
obvious that the point $(x,y)$ does not coincide with the point $(y,x)$ except for
the special case of $x = y$.

All our complex variable analyses can be developed in terms of ordered
pairs² of numbers $(a,b)$, variables $(x,y)$, and functions $(u(x,y), v(x,y))$. The $i$ is
not necessary but it is convenient. It serves to keep pairs in order, somewhat
like the unit vectors of Chapter 1.


¹ The algebra of complex numbers, $a + ib$, is isomorphic with that of matrices
of the form

$$\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$$

(compare Exercise 4.2.4).

² This is how a computer would do complex arithmetic.
FIG. 6.1 Complex plane, Argand diagram

In Chapter 1 the points in the xy-plane are identified with the two-dimensional
displacement vector $\mathbf{r} = \mathbf{i}x + \mathbf{j}y$. As a result, two-dimensional vector
analogs can be developed for much of our complex analysis. Exercise 6.1.2 is
one simple example; Cauchy's theorem, Section 6.3, is another.

Further, from Fig. 6.1 we may write

$$x = r\cos\theta, \qquad y = r\sin\theta \tag{6.2}$$

and

$$z = r(\cos\theta + i\sin\theta). \tag{6.3}$$

Using a result that is suggested (but not rigorously proved)³ by Section 5.6,
we have the very useful polar representation

$$z = re^{i\theta}. \tag{6.4}$$

In this representation $r$ is called the modulus or magnitude of $z$ ($r = |z|$) and
the angle $\theta$ is labeled the argument or phase of $z$.

The choice of polar representation, Eq. 6.4, or cartesian representation, 
Eq. 6.1, is a matter of convenience. Addition and subtraction of complex 
variables are easier in the cartesian representation. Multiplication, division, 
powers, and roots are easier to handle in polar form. 

Analytically or graphically, using the vector analogy, we may show that the
modulus of the sum of two complex numbers is no greater than the sum of the
moduli and no less than the difference, Exercise 6.1.3,

$$|z_1| - |z_2| \le |z_1 + z_2| \le |z_1| + |z_2|. \tag{6.5}$$

Because of the vector analogy, these are called the triangle inequalities.

Using the polar form, Eq. 6.4, we find that the magnitude of a product is the
product of the magnitudes,

$$|z_1 \cdot z_2| = |z_1| \cdot |z_2|. \tag{6.6}$$

Also,

$$\arg(z_1 \cdot z_2) = \arg z_1 + \arg z_2. \tag{6.7}$$

³ Strictly speaking, Chapter 5 was limited to real variables. However, we can
define $e^z$ as $\sum_{n=0}^{\infty} z^n/n!$ for complex $z$. The development of power-series
expansions for complex functions is taken up in Section 6.5 (Laurent expansion).
Alternatively, $e^z$ can be defined by Eqs. 6.3 and 6.4.
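Equations 6.5 to 6.7 can be illustrated with Python's built-in complex type and
the standard `cmath` module. This is a sketch only; the sample values $z_1$ and
$z_2$ and the helper name `multiply_polar` are assumptions made for the example.
Because `cmath.phase` returns the principal value, Eq. 6.7 holds here only up to
an integral multiple of $2\pi$.

```python
import cmath
import math

def multiply_polar(z1, z2):
    """Multiply two complex numbers by multiplying moduli and adding phases."""
    r1, theta1 = cmath.polar(z1)
    r2, theta2 = cmath.polar(z2)
    return cmath.rect(r1 * r2, theta1 + theta2)

z1 = complex(-1.0, 1.0)   # assumed sample value
z2 = complex(-1.0, 0.5)   # assumed sample value

print("cartesian product:", z1 * z2)
print("polar product:    ", multiply_polar(z1, z2))
# Triangle inequalities, Eq. 6.5
print(abs(z1) - abs(z2), "<=", abs(z1 + z2), "<=", abs(z1) + abs(z2))
# Eq. 6.7 modulo 2*pi: phase sum versus principal phase of the product
phase_sum = cmath.phase(z1) + cmath.phase(z2)
print("phase sum:", phase_sum, " principal phase of product:", cmath.phase(z1 * z2))
```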


From our complex variable $z$, complex functions $f(z)$ or $w(z)$ may be con-
structed. These complex functions may then be resolved into real and imaginary
parts

$$w(z) = u(x,y) + iv(x,y), \tag{6.8}$$

in which the separate functions $u(x,y)$ and $v(x,y)$ are pure real. For example,
if $f(z) = z^2$, we have

$$f(z) = (x + iy)^2 = (x^2 - y^2) + i2xy.$$

The real part of a function $f(z)$ will be labeled $\Re f(z)$, whereas the imaginary
part will be labeled $\Im f(z)$. In Eq. 6.8

$$\Re w(z) = u(x,y), \qquad \Im w(z) = v(x,y).$$

The relationship between the independent variable $z$ and the dependent
variable $w$ is perhaps best pictured as a mapping operation. A given $z = x + iy$
means a given point in the z-plane. The complex value of $w(z)$ is then a point
in the w-plane. Points in the z-plane map into points in the w-plane and curves
in the z-plane map into curves in the w-plane as indicated in Fig. 6.2.

FIG. 6.2 The function $w(z) = u(x,y) + iv(x,y)$ maps points in the xy-plane into points in
the uv-plane.

Complex Conjugation

In all these steps, complex number, variable, and function, the operation of
replacing $i$ by $-i$ is called "taking the complex conjugate." The complex
conjugate of $z = x + iy$ is denoted by an asterisk,⁴

$$z^* = x - iy. \tag{6.9}$$

The complex variable $z$ and its complex conjugate $z^*$ are mirror images of each
other reflected in the x-axis, that is, inversion of the y-axis (compare Fig. 6.3).
The product $zz^*$ leads to

$$zz^* = (x + iy)(x - iy) = x^2 + y^2 = r^2. \tag{6.10}$$

Hence

$$(zz^*)^{1/2} = |z|,$$

the magnitude of $z$.


Functions of a Complex Variable 

All the elementary functions of real variables may be extended into the 
complex plane — replacing the real variable x by the complex variable z. This is 
an example of the analytic continuation mentioned in Section 6.5. The extremely 
important relation, Eq. 6.4, is an illustration of this. Moving into the complex 
plane opens up new opportunities for analysis. 


EXAMPLE 6.1.1 De Moivre's Formula

If Eq. 6.3 is raised to the nth power, we have

$$e^{in\theta} = (\cos\theta + i\sin\theta)^n. \tag{6.11}$$

Expanding the exponential now with argument $n\theta$, we obtain

$$\cos n\theta + i\sin n\theta = (\cos\theta + i\sin\theta)^n. \tag{6.12}$$

This is De Moivre's formula.

Now if the right-hand side of Eq. 6.12 is expanded by the binomial theorem,
we obtain $\cos n\theta$ as a series of powers of $\cos\theta$ and $\sin\theta$, Exercise 6.1.6.

Numerous other examples of relations among the exponential, hyperbolic,
and trigonometric functions in the complex plane appear in the exercises.
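The binomial expansion of Eq. 6.12 can be checked numerically. In the sketch
below (function names are illustrative, and `math.comb` assumes Python 3.8 or
later), the even powers of $i\sin\theta$ give the real part, $\cos n\theta$, and
the odd powers give the imaginary part, $\sin n\theta$.

```python
import math

def cos_n_theta_binomial(n, theta):
    """Real part of (cos(theta) + i sin(theta))**n from the binomial theorem."""
    c, s = math.cos(theta), math.sin(theta)
    total = 0.0
    for k in range(0, n + 1, 2):          # even k: i**k = (-1)**(k/2)
        total += (-1) ** (k // 2) * math.comb(n, k) * c ** (n - k) * s ** k
    return total

def sin_n_theta_binomial(n, theta):
    """Imaginary part of (cos(theta) + i sin(theta))**n from the binomial theorem."""
    c, s = math.cos(theta), math.sin(theta)
    total = 0.0
    for k in range(1, n + 1, 2):          # odd k: i**k = i * (-1)**((k-1)/2)
        total += (-1) ** ((k - 1) // 2) * math.comb(n, k) * c ** (n - k) * s ** k
    return total

theta = 0.7  # arbitrary sample angle in radians
for n in (1, 2, 5, 8):
    print(n, cos_n_theta_binomial(n, theta), math.cos(n * theta),
          sin_n_theta_binomial(n, theta), math.sin(n * theta))
```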

Occasionally there are complications. The logarithm of a complex variable
may be expanded using the polar representation

$$\ln z = \ln re^{i\theta} = \ln r + i\theta. \tag{6.13a}$$

This is not complete. To the phase angle $\theta$, we may add any integral multiple
of $2\pi$ without changing $z$. Hence Eq. 6.13a should read

$$\ln z = \ln re^{i(\theta + 2n\pi)} = \ln r + i(\theta + 2n\pi). \tag{6.13b}$$

The parameter $n$ may be any integer. This means that $\ln z$ is a multivalued
function having an infinite number of values for a single pair of real values $r$
and $\theta$. To avoid ambiguity, we usually agree to set $n = 0$ and limit the phase
to an interval of length $2\pi$ such as $(-\pi, \pi)$. The line in the z-plane that is not
crossed, the negative real axis in this case, is labeled a cut line. The value of $\ln z$
with $n = 0$ is called the principal value of $\ln z$.

Further discussion of these functions, including the logarithm, appears in
Section 6.6.

⁴ The complex conjugate is often denoted by $\bar{z}$.
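The multivaluedness in Eq. 6.13b can be made concrete with a short sketch. The
helper `ln_branch` is a name invented here; it builds the value of $\ln z$ on the
branch labeled by the integer $n$ from the principal phase returned by the
standard `cmath.polar`. Every branch exponentiates back to $z$, whereas the
principal logarithm of $e^z$ returns $z$ only when the imaginary part of $z$ lies
in the principal interval.

```python
import cmath
import math

def ln_branch(z, n=0):
    """ln z on the branch labeled n, Eq. 6.13b; n = 0 gives the principal value."""
    r, theta = cmath.polar(z)            # theta is the principal phase
    return complex(math.log(r), theta + 2.0 * math.pi * n)

z = complex(1.0, 4.0)                    # assumed sample value, imaginary part > pi
for n in (-1, 0, 1):
    w = ln_branch(z, n)
    print(n, w, cmath.exp(w))            # exp(w) reproduces z on every branch

# The principal logarithm of exp(z) differs from z by 2*pi*i for this z.
print(cmath.log(cmath.exp(z)), z)
```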


EXERCISES 


6.1.1 (a) Find the reciprocal of $x + iy$, working entirely in the cartesian representa-
tion.

(b) Repeat part (a), working in polar form but expressing the final result in
cartesian form.

6.1.2 The complex quantities $a = u + iv$ and $b = x + iy$ may also be represented as
two-dimensional vectors, $\mathbf{a} = \mathbf{i}u + \mathbf{j}v$, $\mathbf{b} = \mathbf{i}x + \mathbf{j}y$. Show that

$$a^*b = \mathbf{a} \cdot \mathbf{b} + i\mathbf{k} \cdot \mathbf{a} \times \mathbf{b}.$$

6.1.3 Prove algebraically that

$$|z_1| - |z_2| \le |z_1 + z_2| \le |z_1| + |z_2|.$$

Interpret this result in terms of vectors.

Prove that

$$|z - 1| < |\sqrt{z^2 - 1}| < |z + 1|, \qquad \text{for } \Re(z) > 0.$$

6.1.4 We may define a complex conjugation operator $K$ such that $Kz = z^*$. Show that
$K$ is not a linear operator.

6.1.5 Show that complex numbers have square roots and that the square roots are
contained in the complex plane. What are the square roots of $i$?

6.1.6 Show that

(a) $\cos n\theta = \cos^n\theta - \binom{n}{2}\cos^{n-2}\theta\sin^2\theta + \binom{n}{4}\cos^{n-4}\theta\sin^4\theta - \cdots$,

(b) $\sin n\theta = \binom{n}{1}\cos^{n-1}\theta\sin\theta - \binom{n}{3}\cos^{n-3}\theta\sin^3\theta + \binom{n}{5}\cos^{n-5}\theta\sin^5\theta - \cdots$.

Note. The quantities $\binom{n}{m}$ are binomial coefficients:

$$\binom{n}{m} = \frac{n!}{(n-m)!\,m!}.$$
6.1.7 Prove that

(a) $\displaystyle \sum_{n=0}^{N-1} \cos nx = \frac{\sin N(x/2)}{\sin x/2} \cos (N-1)\frac{x}{2}$,

(b) $\displaystyle \sum_{n=0}^{N-1} \sin nx = \frac{\sin N(x/2)}{\sin x/2} \sin (N-1)\frac{x}{2}$.

These series occur in the analysis of the multiple-slit diffraction pattern. Another
application is the analysis of the Gibbs phenomenon, Section 14.5.

Hint. Parts (a) and (b) may be combined to form a geometric series (compare
Section 5.1).

6.1.8 For $-1 < p < 1$ prove that

(a) $\displaystyle \sum_{n=0}^{\infty} p^n \cos nx = \frac{1 - p\cos x}{1 - 2p\cos x + p^2}$,

(b) $\displaystyle \sum_{n=0}^{\infty} p^n \sin nx = \frac{p\sin x}{1 - 2p\cos x + p^2}$.

These series occur in the theory of the Fabry-Perot interferometer.

6.1.9 Assume that the trigonometric functions and the hyperbolic functions are defined
for complex argument by the appropriate power series

$$\sin z = \sum_{n=1,\,\text{odd}}^{\infty} (-1)^{(n-1)/2} \frac{z^n}{n!} = \sum_{s=0}^{\infty} (-1)^s \frac{z^{2s+1}}{(2s+1)!},$$

$$\cos z = \sum_{n=0,\,\text{even}}^{\infty} (-1)^{n/2} \frac{z^n}{n!} = \sum_{s=0}^{\infty} (-1)^s \frac{z^{2s}}{(2s)!},$$

$$\sinh z = \sum_{n=1,\,\text{odd}}^{\infty} \frac{z^n}{n!} = \sum_{s=0}^{\infty} \frac{z^{2s+1}}{(2s+1)!},$$

$$\cosh z = \sum_{n=0,\,\text{even}}^{\infty} \frac{z^n}{n!} = \sum_{s=0}^{\infty} \frac{z^{2s}}{(2s)!}.$$

(a) Show that

$$i\sin z = \sinh iz, \qquad \sin iz = i\sinh z,$$
$$\cos z = \cosh iz, \qquad \cos iz = \cosh z.$$

(b) Verify that familiar functional relations such as

$$\cosh z = \frac{e^z + e^{-z}}{2},$$
$$\sin(z_1 + z_2) = \sin z_1 \cos z_2 + \sin z_2 \cos z_1,$$

still hold in the complex plane.

6.1.10 Using the identities

$$\cos z = \frac{e^{iz} + e^{-iz}}{2}, \qquad \sin z = \frac{e^{iz} - e^{-iz}}{2i},$$

established from comparison of power series, show that

(a) $\sin(x + iy) = \sin x \cosh y + i\cos x \sinh y$,
$\cos(x + iy) = \cos x \cosh y - i\sin x \sinh y$,

(b) $|\sin z|^2 = \sin^2 x + \sinh^2 y$,
$|\cos z|^2 = \cos^2 x + \sinh^2 y$.

This demonstrates that we may have $|\sin z|$, $|\cos z| > 1$ in the complex plane.

6.1.11 From the identities in Ex. 6.1.9 and 6.1.10 show that

(a) $\sinh(x + iy) = \sinh x \cos y + i\cosh x \sin y$,
$\cosh(x + iy) = \cosh x \cos y + i\sinh x \sin y$,

(b) $|\sinh z|^2 = \sinh^2 x + \sin^2 y$,
$|\cosh z|^2 = \sinh^2 x + \cos^2 y$.

6.1.12 Prove that

(a) $|\sin z| \ge |\sin x|$,

(b) $|\cos z| \ge |\cos x|$.

6.1.13 Show that the exponential function $e^z$ is periodic with a pure imaginary period
of $2\pi i$.


6.1.14 Show that

(a) $\displaystyle \tanh(z/2) = \frac{\sinh x + i\sin y}{\cosh x + \cos y}$,

(b) $\displaystyle \coth(z/2) = \frac{\sinh x - i\sin y}{\cosh x - \cos y}$.

6.1.15 Find all the zeros of

(a) $\sin z$, (b) $\cos z$, (c) $\sinh z$, (d) $\cosh z$.

6.1.16 Show that

(a) $\sin^{-1} z = -i\ln(iz \pm \sqrt{1 - z^2})$,

(b) $\cos^{-1} z = -i\ln(z + \sqrt{z^2 - 1})$,

(c) $\displaystyle \tan^{-1} z = \frac{i}{2}\ln\left(\frac{i + z}{i - z}\right)$,

(d) $\sinh^{-1} z = \ln(z + \sqrt{z^2 + 1})$,

(e) $\cosh^{-1} z = \ln(z + \sqrt{z^2 - 1})$,

(f) $\displaystyle \tanh^{-1} z = \frac{1}{2}\ln\left(\frac{1 + z}{1 - z}\right)$.

Hint. 1. Express the trigonometric and hyperbolic functions in terms of exponen-
tials. 2. Solve for the exponential and then for the exponent.


6.1.17 In the quantum theory of the photoionization we encounter the identity

$$\left(\frac{ia - 1}{ia + 1}\right)^{ib} = \exp(-2b\cot^{-1} a),$$

in which $a$ and $b$ are real. Verify this identity.

6.1.18 A plane wave of light of angular frequency $\omega$ is represented by

$$e^{i\omega(t - nx/c)}.$$

In a certain substance the simple real index of refraction $n$ is replaced by the
complex quantity $n - ik$. What is the effect of $k$ on the wave? What does $k$ corre-
spond to physically? The generalization of a quantity from real to complex form
occurs frequently in physics. Examples range from the complex Young's modulus
of viscoelastic materials to the complex potential of the "cloudy crystal ball"
model of the atomic nucleus.

6.1.19 We see that for the angular momentum components defined in Exercise 2.5.14

$$(L_x - iL_y) \ne (L_x + iL_y)^*.$$

Explain why this occurs.
6.1.20 Show that the phase of $f(z) = u + iv$ is equal to the imaginary part of the log-
arithm of $f(z)$. Exercise 10.2.13 depends on this result.

6.1.21 (a) Show that $e^{\ln z}$ always equals $z$.

(b) Show that $\ln e^z$ does not always equal $z$.

6.1.22 The infinite product representations of Section 5.11 hold when the real variable
$x$ is replaced by the complex variable $z$. From this, develop infinite product
representations for

(a) $\sinh z$,

(b) $\cosh z$.

6.1.23 The equation of motion of a mass $m$ relative to a rotating coordinate system is

$$m\frac{d^2\mathbf{r}}{dt^2} = \mathbf{F} - m\boldsymbol{\omega} \times (\boldsymbol{\omega} \times \mathbf{r}) - 2m\left(\boldsymbol{\omega} \times \frac{d\mathbf{r}}{dt}\right) - m\left(\frac{d\boldsymbol{\omega}}{dt} \times \mathbf{r}\right).$$

Consider the case $\mathbf{F} = 0$, $\mathbf{r} = \mathbf{i}x + \mathbf{j}y$, and $\boldsymbol{\omega} = \omega\mathbf{k}$, with $\omega$ constant. Show that
the replacement of $\mathbf{r} = \mathbf{i}x + \mathbf{j}y$ by $z = x + iy$ leads to

$$\frac{d^2 z}{dt^2} + i2\omega\frac{dz}{dt} - \omega^2 z = 0.$$

Note. This differential equation may be solved by the substitution $z = fe^{-i\omega t}$.


6.1.24 Using the complex arithmetic available in FORTRAN IV, write a program that
will calculate the complex exponential $e^z$ from its series expansion (definition).
Calculate $e^z$ for $z = e^{in\pi/6}$, $n = 0, 1, 2, \ldots, 12$. Tabulate the phase angle ($\theta = n\pi/6$),
$\Re(z)$, $\Im(z)$, $\Re(e^z)$, $\Im(e^z)$, $|e^z|$, and the phase of $e^z$.

Check value. $n = 5$, $\theta = 2.61799$, $\Re(z) = -0.86602$,
$\Im(z) = 0.50000$, $\Re(e^z) = 0.36913$, $\Im(e^z) = 0.20166$,
$|e^z| = 0.42062$, phase$(e^z) = 0.50000$.

6.1.25 Using the complex arithmetic available in FORTRAN IV, calculate and tabu-
late $\Re(\sinh z)$, $\Im(\sinh z)$, $|\sinh z|$, and phase$(\sinh z)$ for $x = 0.0\,(0.1)\,1.0$ and
$y = 0.0\,(0.1)\,1.0$.

Hint. Beware of dividing by zero if calculating an angle as an arc tangent.

Check value. $z = 0.2 + 0.1i$, $\Re(\sinh z) = 0.20033$,
$\Im(\sinh z) = 0.10184$, $|\sinh z| = 0.22473$,
phase$(\sinh z) = 0.47030$.

6.1.26 Repeat Exercise 6.1.25 for $\cosh z$.
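The series-summation idea behind Exercise 6.1.24 can be sketched in a modern
language. The version below is written in Python rather than FORTRAN IV, and the
function name `exp_series`, the tolerance, and the term limit are all assumptions
of this sketch. It sums $\sum z^n/n!$ term by term and tabulates the requested
quantities; `cmath.phase` uses a two-argument arc tangent, which avoids the
division by zero mentioned in the hint to Exercise 6.1.25.

```python
import cmath
import math

def exp_series(z, tol=1e-15, max_terms=200):
    """Sum the Maclaurin series of e**z until the terms are negligible."""
    term = complex(1.0, 0.0)     # z**0 / 0!
    total = complex(1.0, 0.0)
    for n in range(1, max_terms):
        term *= z / n            # builds z**n / n! from the previous term
        total += term
        if abs(term) < tol * abs(total):
            break
    return total

print(" n    theta     Re z      Im z    Re e^z    Im e^z    |e^z|    phase")
for n in range(13):
    theta = n * math.pi / 6.0
    z = cmath.exp(1j * theta)    # point on the unit circle with phase theta
    w = exp_series(z)
    print(f"{n:2d} {theta:8.5f} {z.real:9.5f} {z.imag:9.5f} "
          f"{w.real:9.5f} {w.imag:9.5f} {abs(w):8.5f} {cmath.phase(w):8.5f}")
```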


6.2 CAUCHY-RIEMANN CONDITIONS

Having established complex functions of a complex variable, we now proceed
to differentiate them. The derivative of $f(z)$, like that of a real function, is defined
by

$$\lim_{\delta z \to 0} \frac{f(z + \delta z) - f(z)}{z + \delta z - z} = \lim_{\delta z \to 0} \frac{\delta f(z)}{\delta z} = \frac{df}{dz} \quad \text{or} \quad f'(z), \tag{6.14}$$

provided that the limit is independent of the particular approach to the point $z$.
For real variables we require that the right-hand limit ($x \to x_0$ from above)
and the left-hand limit ($x \to x_0$ from below) be equal for the derivative $df(x)/dx$
to exist at $x = x_0$. Now, with $z$ (or $z_0$) some point in a plane, our requirement
that the limit be independent of the direction of approach is very restrictive.

FIG. 6.4 Alternate approaches to $z_0$ (along $\delta y = 0$, $\delta x \to 0$ and along $\delta x = 0$, $\delta y \to 0$)

Consider increments $\delta x$ and $\delta y$ of the variables $x$ and $y$, respectively. Then

$$\delta z = \delta x + i\,\delta y. \tag{6.15}$$

Also,

$$\delta f = \delta u + i\,\delta v, \tag{6.16}$$

so that

$$\frac{\delta f}{\delta z} = \frac{\delta u + i\,\delta v}{\delta x + i\,\delta y}. \tag{6.17}$$
Let us take the limit indicated by Eq. 6.14 by two different approaches as shown
in Fig. 6.4. First, with $\delta y = 0$, we let $\delta x \to 0$. Equation 6.14 yields

$$\lim_{\delta z \to 0} \frac{\delta f}{\delta z} = \lim_{\delta x \to 0} \left( \frac{\delta u}{\delta x} + i\frac{\delta v}{\delta x} \right) = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}, \tag{6.18}$$

assuming the partial derivatives exist. For a second approach, we set $\delta x = 0$
and then let $\delta y \to 0$. This leads to

$$\lim_{\delta z \to 0} \frac{\delta f}{\delta z} = \lim_{\delta y \to 0} \left( -i\frac{\delta u}{\delta y} + \frac{\delta v}{\delta y} \right) = -i\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y}. \tag{6.19}$$

If we are to have a derivative $df/dz$, Eqs. 6.18 and 6.19 must be identical.
Equating real parts to real parts and imaginary parts to imaginary parts (like
components of vectors), we obtain

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}. \tag{6.20}$$

These are the famous Cauchy-Riemann conditions. They were discovered by
Cauchy and used extensively by Riemann in his theory of analytic functions.
These Cauchy-Riemann conditions are necessary for the existence of a deriva-
tive of $f(z)$, that is, if $df/dz$ exists, the Cauchy-Riemann conditions must hold.

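Conversely, if the Cauchy-Riemann conditions are satisfied and the partial
derivatives of $u(x,y)$ and $v(x,y)$ are continuous, the derivative $df/dz$ exists.
This may be shown by writing

$$\delta f = \left( \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x} \right)\delta x + \left( \frac{\partial u}{\partial y} + i\frac{\partial v}{\partial y} \right)\delta y. \tag{6.21}$$

The justification for this expression depends on the continuity of the partial
derivatives of $u$ and $v$. Dividing by $\delta z$, we have

$$\frac{\delta f}{\delta z} = \frac{(\partial u/\partial x + i(\partial v/\partial x))\,\delta x + (\partial u/\partial y + i(\partial v/\partial y))\,\delta y}{\delta x + i\,\delta y}
= \frac{(\partial u/\partial x + i(\partial v/\partial x)) + (\partial u/\partial y + i(\partial v/\partial y))\,\delta y/\delta x}{1 + i(\delta y/\delta x)}. \tag{6.22}$$

If $\delta f/\delta z$ is to have a unique value, the dependence on $\delta y/\delta x$ must be eliminated.
Applying the Cauchy-Riemann conditions to the $y$ derivatives, we obtain

$$\frac{\partial u}{\partial y} + i\frac{\partial v}{\partial y} = -\frac{\partial v}{\partial x} + i\frac{\partial u}{\partial x} = i\left( \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x} \right). \tag{6.23}$$

Substituting Eq. 6.23 into Eq. 6.22, we may cancel out the $\delta y/\delta x$ dependence
and

$$\frac{\delta f}{\delta z} = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}, \tag{6.24}$$

which shows that $\lim \delta f/\delta z$ is independent of the direction of approach in the
complex plane as long as the partial derivatives are continuous.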
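The direction dependence of the difference quotient can be seen numerically. The
sketch below (the names, the sample point $z_0 = 1 + 2i$, and the step size are
assumptions of this example) evaluates $\delta f/\delta z$ along $\delta y = 0$
and along $\delta x = 0$ for $f(z) = z^2$, which satisfies the Cauchy-Riemann
conditions, and for $f(z) = z^*$, which does not.

```python
def difference_quotient(f, z, dz):
    """Finite-difference estimate of delta f / delta z for the increment dz."""
    return (f(z + dz) - f(z)) / dz

def square(z):
    return z * z

def conjugate(z):
    return z.conjugate()

z0 = complex(1.0, 2.0)   # assumed sample point
h = 1.0e-6               # assumed small real step

for name, f in (("z**2", square), ("z*", conjugate)):
    along_x = difference_quotient(f, z0, complex(h, 0.0))   # delta y = 0
    along_y = difference_quotient(f, z0, complex(0.0, h))   # delta x = 0
    print(f"{name:5s} along x: {along_x}   along y: {along_y}")
```

For $z^2$ both approaches give approximately $2z_0$, while for $z^*$ the two
quotients are $+1$ and $-1$, so no unique limit exists.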

It is worthwhile noting that the Cauchy-Riemann conditions guarantee
that the curves $u = c_1$ will be orthogonal to the curves $v = c_2$ (compare Section
2.1). This is fundamental in application to potential problems in a variety of
areas of physics. If $u = c_1$ is a line of electric force, then $v = c_2$ is an equipotential
line (surface), and vice versa. A further implication for potential theory is
developed in Exercise 6.2.1.


Analytic Functions

Finally, if $f(z)$ is differentiable at $z = z_0$ and in some small region around $z_0$,
we say that $f(z)$ is analytic¹ at $z = z_0$. If $f(z)$ is analytic everywhere in the (finite)
complex plane, we call it an entire function. Our theory of complex variables
here is essentially one of analytic functions of complex variables, which points
up the crucial importance of the Cauchy-Riemann conditions. The concept of
analyticity carried on in advanced theories of modern physics plays a crucial
role in dispersion theory (of elementary particles). If $f'(z)$ does not exist at
$z = z_0$, then $z_0$ is labeled a singular point and consideration of it is postponed
until Section 7.1.

¹ Some writers use the term holomorphic.

To illustrate the Cauchy-Riemann conditions, consider two very simple
examples.

EXAMPLE 6.2.1

Let $f(z) = z^2$. Then the real part $u(x,y) = x^2 - y^2$ and the imaginary part
$v(x,y) = 2xy$. Following Eq. 6.20,

$$\frac{\partial u}{\partial x} = 2x = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -2y = -\frac{\partial v}{\partial x}.$$

We see that $f(z) = z^2$ satisfies the Cauchy-Riemann conditions throughout the
complex plane. Since the partial derivatives are clearly continuous, we conclude
that $f(z) = z^2$ is analytic.

EXAMPLE 6.2.2

Let $f(z) = z^*$. Now $u = x$ and $v = -y$. Applying the Cauchy-Riemann
conditions, we obtain

$$\frac{\partial u}{\partial x} = 1 \ne \frac{\partial v}{\partial y}.$$

The Cauchy-Riemann conditions are not satisfied and $f(z) = z^*$ is not an
analytic function of $z$. It is interesting to note that $f(z) = z^*$ is continuous,
thus providing an example of a function that is everywhere continuous but
nowhere differentiable.

The derivative of a real function of a real variable is essentially a local 
characteristic, in that it provides information about the function only in a local 
neighborhood — for instance, as a truncated Taylor expansion. The existence 
of a derivative of a function of a complex variable has much more far-reaching 
implications. The real and imaginary parts of our analytic function must 
separately satisfy Laplace’s equation. This is Exercise 6.2.1. Further, our analytic 
function is guaranteed derivatives of all orders, Section 6.4. In this sense the
derivative not only governs the local behavior of the complex function, but
controls the distant behavior as well.


EXERCISES 

6.2.1 The functions $u(x,y)$ and $v(x,y)$ are the real and imaginary parts, respectively, of
an analytic function $w(z)$.

(a) Assuming that the required derivatives exist, show that

$$\nabla^2 u = \nabla^2 v = 0.$$

Solutions of Laplace's equation such as $u(x,y)$ and $v(x,y)$ are called harmonic
functions.

(b) Show that

$$\frac{\partial u}{\partial x}\frac{\partial u}{\partial y} + \frac{\partial v}{\partial x}\frac{\partial v}{\partial y} = 0$$

and give a geometric interpretation.

Hint. The technique of Section 1.6 allows you to construct vectors normal to the
curves $u(x,y) = c_i$ and $v(x,y) = c_j$.

6.2.2 Show whether or not the function $f(z) = \Re(z) = x$ is analytic.

6.2.3 Having shown that the real part $u(x,y)$ and the imaginary part $v(x,y)$ of an analytic
function $w(z)$ each satisfy Laplace's equation, show that $u(x,y)$ and $v(x,y)$ cannot
have either a maximum or a minimum in the interior of any region in which $w(z)$
is analytic. (They can have saddle points.)

6.2.4 Let $A = \partial^2 w/\partial x^2$, $B = \partial^2 w/\partial x\,\partial y$, $C = \partial^2 w/\partial y^2$. From the calculus of functions
of two variables, $w(x,y)$, we have a saddle point if

$$B^2 - AC > 0.$$

With $f(z) = u(x,y) + iv(x,y)$, apply the Cauchy-Riemann conditions and show
that neither $u(x,y)$ nor $v(x,y)$ has a maximum or a minimum in a finite region of
the complex plane.

6.2.5 Find the analytic functions

$$w(z) = u(x,y) + iv(x,y)$$

if

(a) $u(x,y) = x^3 - 3xy^2$,

(b) $v(x,y) = e^{-y}\sin x$.

6.2.6 If there is some common region in which $w_1 = u(x,y) + iv(x,y)$ and $w_2 = w_1^* =
u(x,y) - iv(x,y)$ are both analytic, prove that $u(x,y)$ and $v(x,y)$ are constants.

6.2.7 The function $f(z) = u(x,y) + iv(x,y)$ is analytic. Show that $f^*(z^*)$ is also analytic.

6.2.8 Using $f(re^{i\theta}) = R(r,\theta)e^{i\Theta(r,\theta)}$, in which $R(r,\theta)$ and $\Theta(r,\theta)$ are differentiable real
functions of $r$ and $\theta$, show that the Cauchy-Riemann conditions in polar coor-
dinates become

(a) $\displaystyle \frac{\partial R}{\partial r} = \frac{R}{r}\frac{\partial \Theta}{\partial \theta}$,

(b) $\displaystyle \frac{1}{r}\frac{\partial R}{\partial \theta} = -R\frac{\partial \Theta}{\partial r}$.

Hint. Set up the derivative first with $\delta z$ radial and then with $\delta z$ tangential.

6.2.9 As an extension of Exercise 6.2.8 show that $\Theta(r,\theta)$ satisfies Laplace's equation in
polar coordinates, Eq. 2.33 (without the final term).

6.2.10 Two-dimensional irrotational fluid flow is conveniently described by a complex
potential $f(z) = u(x,y) + iv(x,y)$. We label the real part $u(x,y)$, the velocity poten-
tial and the imaginary part $v(x,y)$, the stream function. The fluid velocity $\mathbf{V}$ is
given by $\mathbf{V} = \nabla u$. If $f(z)$ is analytic,

(a) Show that $df/dz = V_x - iV_y$,

(b) Show that $\nabla \cdot \mathbf{V} = 0$ (no sources or sinks),

(c) Show that $\nabla \times \mathbf{V} = 0$ (irrotational, nonturbulent flow).
6.2.11 A proof of the Schwarz inequality (Section 9.4) involves minimizing an expression

$$f = \psi_{aa} + \lambda\psi_{ab} + \lambda^*\psi_{ab}^* + \lambda\lambda^*\psi_{bb} \ge 0.$$

The $\psi$'s are integrals of products of functions; $\psi_{aa}$ and $\psi_{bb}$ are real, $\psi_{ab}$ is complex.
$\lambda$ is a parameter, possibly complex.

(a) Differentiate the preceding expression with respect to $\lambda^*$, treating $\lambda$ as an
independent parameter, independent of $\lambda^*$. Show that setting the derivative
$\partial f/\partial \lambda^*$ equal to zero yields

$$\lambda = -\psi_{ab}^*/\psi_{bb}.$$

(b) Show that $\partial f/\partial \lambda = 0$ leads to the same result.

(c) Let $\lambda = x + iy$, $\lambda^* = x - iy$. Set the $x$ and $y$ derivatives equal to zero and
show that again

$$\lambda = -\psi_{ab}^*/\psi_{bb}.$$

This independence of $\lambda$ and $\lambda^*$ appears again in Section 17.7.

6.2.12 The function $f(z)$ is analytic. Show that the derivative of $f(z)$ with respect to $z^*$
vanishes.

Hint. Use the chain rule and take $x = (z + z^*)/2$, $y = (z - z^*)/2i$.

Note. This result emphasizes that our analytic function $f(z)$ is not just a complex
function of two real variables $x$ and $y$. It is a function of the complex variable
$x + iy$.


6.3 CAUCHY'S INTEGRAL THEOREM

Contour Integrals

With differentiation under control, we turn to integration. The integral of a
complex variable over a contour in the complex plane may be defined in close
analogy to the (Riemann) integral of a real function integrated along the real
x-axis.

We divide the contour $z_0 z_0'$ into $n$ intervals by picking $n - 1$ intermediate
points $z_1, z_2, \ldots$, on the contour (Figure 6.5). Consider the sum

$$S_n = \sum_{j=1}^{n} f(\zeta_j)(z_j - z_{j-1}), \tag{6.25}$$

where $\zeta_j$ is a point on the curve between $z_j$ and $z_{j-1}$. Now let $n \to \infty$ with

$$|z_j - z_{j-1}| \to 0$$

for all $j$. If the $\lim_{n\to\infty} S_n$ exists and is independent of the details of choosing the
points $z_j$ and $\zeta_j$, then

$$\lim_{n\to\infty} \sum_{j=1}^{n} f(\zeta_j)(z_j - z_{j-1}) = \int_{z_0}^{z_0'} f(z)\,dz. \tag{6.26}$$

The right-hand side of Eq. 6.26 is called the contour integral of $f(z)$ (along the
specified contour $C$ from $z = z_0$ to $z = z_0'$).

FIG. 6.5

The preceding development of the contour integral is closely analogous to
the Riemann integral of a real function of a real variable. As an alternative,
the contour integral may be defined by

$$\int_{z_1}^{z_2} f(z)\,dz = \int_{x_1,y_1}^{x_2,y_2} [u(x,y) + iv(x,y)][dx + i\,dy]
= \int_{x_1,y_1}^{x_2,y_2} [u(x,y)\,dx - v(x,y)\,dy] + i\int_{x_1,y_1}^{x_2,y_2} [v(x,y)\,dx + u(x,y)\,dy],$$

with the path joining $(x_1,y_1)$ and $(x_2,y_2)$ specified. This reduces the complex
integral to the complex sum of real integrals. It is somewhat analogous to the
replacement of a vector integral by the vector sum of scalar integrals, Section
1.10.


Stokes's Theorem Proof 

Cauchy’s integral theorem is the first of two basic theorems in the theory of 
the behavior of functions of a complex variable. First, a proof under relatively 
restrictive conditions — conditions that are intolerable to the mathematician 
developing a beautiful abstract theory but that are usually satisfied in physical 
problems. 

If a function f(z) is analytic (therefore single-valued) and its partial derivatives
are continuous throughout some simply connected region R,¹ for every closed
path C (Fig. 6.6) in R the line integral of f(z) around C is zero or

$$\int_C f(z)\,dz = \oint_C f(z)\,dz = 0. \tag{6.27}$$


1 A simply connected region or domain is one in which every closed contour
in that region encloses only the points contained in it. If a region is not simply
connected, it is called multiply connected. As an example of a multiply connected
region, consider the z-plane with the interior of the unit circle excluded.

FIG. 6.6 A closed contour C within a simply connected region


The symbol $\oint$ is used to emphasize that the path is closed. The reader will recall
that in Section 1.13 such a function f(z), identified as a force, was labeled
conservative.

In this form the Cauchy integral theorem may be proved by direct application
of Stokes’s theorem (Section 1.12). With f(z) = u(x,y) + iv(x,y) and
dz = dx + i dy,

$$\oint_C f(z)\,dz = \oint_C (u + iv)(dx + i\,dy)
= \oint_C (u\,dx - v\,dy) + i\oint_C (v\,dx + u\,dy). \tag{6.28}$$


These two line integrals may be converted to surface integrals by Stokes’s
theorem, a procedure that is justified if the partial derivatives are continuous
within C. In applying Stokes’s theorem, the reader might note that the final
two integrals of Eq. 6.28 are completely real. Using

$$\mathbf{V} = \mathbf{i}V_x + \mathbf{j}V_y,$$

we have

$$\oint_C (V_x\,dx + V_y\,dy) = \int \left(\frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y}\right) dx\,dy. \tag{6.29}$$

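For the first integral in the last part of Eq. 6.28 let $u = V_x$ and $v = -V_y$.² Then

$$\oint_C (u\,dx - v\,dy) = \oint_C (V_x\,dx + V_y\,dy)
= \int \left(\frac{\partial V_y}{\partial x} - \frac{\partial V_x}{\partial y}\right) dx\,dy
= -\int \left(\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\right) dx\,dy. \tag{6.30}$$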


For the second integral on the right side of Eq. 6.28 we let $u = V_y$ and $v = V_x$.
Using Stokes’s theorem again, we obtain

$$\oint_C (v\,dx + u\,dy) = \int \left(\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\right) dx\,dy. \tag{6.31}$$

2 In the proof of Stokes’s theorem, Section 1.12, $V_x$ and $V_y$ are any two functions
(with continuous partial derivatives).


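On application of the Cauchy-Riemann conditions that must hold, since f(z)
is assumed analytic, each integrand vanishes and

$$\oint_C f(z)\,dz = -\int \left(\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\right) dx\,dy
+ i\int \left(\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\right) dx\,dy = 0. \tag{6.32}$$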
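Equation 6.32 can be checked numerically for a specific contour. The sketch below is an illustration only, with assumed names (`closed_integral`, `square`) and an assumed midpoint-rule discretization: it integrates the analytic function $e^z$ and the nonanalytic function z* around the unit square.

```python
import cmath

def closed_integral(f, vertices, n=2000):
    """Midpoint-rule approximation to the integral of f around the closed
    polygon through `vertices`, traversed in the order given."""
    total = 0j
    m = len(vertices)
    for k in range(m):
        a, b = vertices[k], vertices[(k + 1) % m]
        dz = (b - a) / n
        for j in range(n):
            total += f(a + (j + 0.5) * dz) * dz
    return total

square = [0, 1, 1 + 1j, 1j]          # unit square, counterclockwise
analytic = closed_integral(cmath.exp, square)
nonanalytic = closed_integral(lambda z: z.conjugate(), square)
print(abs(analytic))      # close to 0, as Eq. 6.32 requires
print(nonanalytic)        # close to 2i: z* fails the Cauchy-Riemann conditions
```

For z* the two surface integrands of Eq. 6.32 do not vanish; the second one equals 2, so the closed integral is 2i times the enclosed area.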


Cauchy-Goursat Proof 

This completes the proof of Cauchy’s integral theorem. However, the proof 
is marred from a theoretical point of view by the need for continuity of the first 
partial derivatives. Actually, as shown by Goursat, this condition is not essential. 
An outline of the Goursat proof is as follows. We subdivide the region inside 
the contour C into a network of small squares as indicated in Fig. 6.7. Then 



FIG. 6.7 Cauchy-Goursat contours

$$\oint_C f(z)\,dz = \sum_j \oint_{C_j} f(z)\,dz, \tag{6.33}$$

all integrals along interior lines canceling out. To attack the $\oint_{C_j} f(z)\,dz$, we
construct the function

$$\delta_j(z, z_j) = \frac{f(z) - f(z_j)}{z - z_j} - \left.\frac{df(z)}{dz}\right|_{z=z_j}, \tag{6.34}$$


with $z_j$ an interior point of the jth subregion. Note that $[f(z) - f(z_j)]/(z - z_j)$
is an approximation to the derivative at $z = z_j$. Equivalently, we may note that
if f(z) had a Taylor expansion (which we have not yet proved), then $\delta_j(z, z_j)$
would be of order $z - z_j$, approaching zero as the network was made finer.
We may make

$$|\delta_j(z, z_j)| < \varepsilon, \tag{6.35}$$

where $\varepsilon$ is an arbitrarily chosen small positive quantity.

Solving Eq. 6.34 for f(z) and integrating around $C_j$, we obtain

$$\oint_{C_j} f(z)\,dz = \oint_{C_j} (z - z_j)\,\delta_j(z, z_j)\,dz, \tag{6.36}$$

the integrals of the other terms vanishing.³ When Eqs. 6.35 and 6.36 are combined,
one may show that

$$\left|\sum_j \oint_{C_j} f(z)\,dz\right| < A\varepsilon, \tag{6.37}$$


where A is a term of the order of the area of the enclosed region. Since $\varepsilon$ is
arbitrary, we let $\varepsilon \to 0$ and conclude that:

If a function f(z) is analytic on and within a closed path C,

$$\oint_C f(z)\,dz = 0. \tag{6.38}$$

Details of the proof of this significantly more general and more powerful form
can be found in Churchill and in the other references cited. Actually we can still
prove the theorem for f(z) analytic within the interior of C and only continuous
on C.

The consequence of the Cauchy integral theorem is that for analytic functions
the line integral is a function only of its end points, independent of the path of
integration:

$$\int_{z_1}^{z_2} f(z)\,dz = F(z_2) - F(z_1) = -\int_{z_2}^{z_1} f(z)\,dz, \tag{6.39}$$

again exactly like the case of a conservative force, Section 1.13.
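Path independence, Eq. 6.39, is easy to illustrate numerically. In this sketch (an assumed example, with the hypothetical helper `line_integral` and the arbitrarily chosen integrand $z^2$), two different paths from 0 to 1 + i are compared with the difference of the antiderivative $F(z) = z^3/3$.

```python
def line_integral(f, a, b, n=2000):
    """Midpoint-rule integral of f along the straight segment from a to b."""
    dz = (b - a) / n
    return sum(f(a + (j + 0.5) * dz) for j in range(n)) * dz

f = lambda z: z * z                   # analytic everywhere
F = lambda z: z ** 3 / 3              # an antiderivative, F'(z) = f(z)
z1, z2 = 0j, 1 + 1j

# Path a: along the real axis to 1, then vertically to 1 + i.
path_a = line_integral(f, z1, 1 + 0j) + line_integral(f, 1 + 0j, z2)
# Path b: along the imaginary axis to i, then horizontally to 1 + i.
path_b = line_integral(f, z1, 1j) + line_integral(f, 1j, z2)
print(path_a, path_b, F(z2) - F(z1))
```

The three printed values should agree to within the discretization error of the midpoint rule.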


Multiply Connected Regions 

The original statement of our theorem demanded a simply connected region.
This restriction may easily be relaxed by the creation of a barrier, a cut line.
Consider the multiply connected region of Fig. 6.8, in which f(z) is not defined
for the interior R'. Cauchy’s integral theorem is not valid for the contour C,
as shown, but we can construct a contour C' for which the theorem holds.
We cut from the interior forbidden region R' to the forbidden region exterior
to R and then run a new contour C', as shown in Fig. 6.9.

The new contour C' through ABDEFGA never crosses the cut line that
literally converts R into a simply connected region. The three-dimensional
analog of this technique was used in Section 1.14 to prove Gauss’s law. By
Eq. 6.39

$$\int_G^A f(z)\,dz = -\int_D^E f(z)\,dz, \tag{6.40}$$


3 $\oint dz$ and $\oint z\,dz = 0$.

FIG. 6.8 A closed contour C in a multiply connected region

FIG. 6.9 Conversion of a multiply connected region into a simply connected region


f(z) having been continuous across the cut line and line segments DE and GA
arbitrarily close together. Then

$$\oint_{C'} f(z)\,dz = \int_{ABD} f(z)\,dz + \int_{EFG} f(z)\,dz = 0 \tag{6.41}$$


by Cauchy’s integral theorem, with region R now simply connected. Applying
Eq. 6.39 once again with $ABD \to C_1'$ and $EFG \to -C_2'$, we obtain

$$\oint_{C_1'} f(z)\,dz = \oint_{C_2'} f(z)\,dz, \tag{6.42}$$

in which $C_1'$ and $C_2'$ are both traversed in the same (counterclockwise) direction.
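Equation 6.42 says that a contour may be deformed freely as long as it does not cross a point where f(z) fails to be analytic. A small numerical sketch, under the assumption that both contours are circles and with the hypothetical helper `circle_integral`, compares two quite different circles enclosing the singular point of 1/z.

```python
import cmath
import math

def circle_integral(f, center, radius, n=4000):
    """Integral of f counterclockwise around |z - center| = radius,
    using z = center + radius*e^{i theta} with equal steps in theta."""
    total = 0j
    for k in range(n):
        theta = 2 * math.pi * (k + 0.5) / n
        e = cmath.exp(1j * theta)
        total += f(center + radius * e) * (1j * radius * e) * (2 * math.pi / n)
    return total

f = lambda z: 1 / z                           # analytic except at z = 0
inner = circle_integral(f, 0, 0.5)            # small circle around the origin
outer = circle_integral(f, 0.3 + 0.2j, 2.0)   # larger, off-center circle
print(inner, outer)                           # the two values agree
```

Both circles are traversed counterclockwise and both enclose z = 0, so f(z) is analytic in the region between them and the two integrals coincide.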

It should be emphasized that the cut line here is a matter of mathematical 
convenience, to permit the application of Cauchy’s integral theorem. Since f(z) 
is analytic in the annular region, it is necessarily single-valued and continuous 
across any such cut line. When we consider branch points (Section 7.1) our 
functions will not be single-valued and a cut line will be required to make them 
single-valued. 


EXERCISES 

6.3.1 Show that $\int_{z_1}^{z_2} f(z)\,dz = -\int_{z_2}^{z_1} f(z)\,dz$.

6.3.2 In the Goursat proof of Cauchy’s integral theorem we take

$$\oint z\,dz = 0.$$

Show that this expression holds, taking the path of integration to be the unit
circle, |z| = 1.


6.3.3 Prove that

$$\left|\int_C f(z)\,dz\right| \le |f|_{\max} \cdot L,$$

where $|f|_{\max}$ is the maximum value of |f(z)| along the contour C and L is the length
of the contour.


6.3.4 Verify that

$$\int z^*\,dz$$

depends on the path by evaluating the integral for the two paths shown in Fig. 6.10.
Recall that f(z) = z* is not an analytic function of z and that Cauchy’s integral
theorem therefore does not apply.

FIG. 6.10


6.3.5 Show that

$$\oint_C \frac{dz}{z^2 + z} = 0,$$

in which the contour C is a circle defined by |z| = R > 1.

Hint. Direct use of the Cauchy integral theorem is illegal. Why? The integral may
be evaluated by transforming to polar coordinates and using tables. The preferred
technique would be the calculus of residues, Section 7.2. This yields 0 for R > 1
and $2\pi i$ for R < 1.


6.4 CAUCHY'S INTEGRAL FORMULA 


As in the preceding section, we consider a function f(z) that is analytic on
a closed contour C and within the interior region bounded by C. We seek to
prove that

$$\oint_C \frac{f(z)}{z - z_0}\,dz = 2\pi i f(z_0), \tag{6.43}$$

in which $z_0$ is some point in the interior region bounded by C. This is the second
of the two basic theorems mentioned in Section 6.3. Note carefully that since
z is on the contour C while $z_0$ is in the interior, $z - z_0 \ne 0$ and the integral
Eq. 6.43 is well defined.



FIG. 6.11 Exclusion of a singular point

Although f(z) is assumed analytic, the integrand is $f(z)/(z - z_0)$ and is not
analytic at $z = z_0$. If the contour is deformed as shown in Fig. 6.11 (or Fig. 6.9,
Section 6.3), Cauchy’s integral theorem applies. By Eq. 6.42

$$\oint_C \frac{f(z)}{z - z_0}\,dz - \oint_{C_2} \frac{f(z)}{z - z_0}\,dz = 0, \tag{6.44}$$


where C is the original outer contour and $C_2$ is the circle surrounding the point
$z_0$ traversed in a counterclockwise direction. Let $z = z_0 + re^{i\theta}$, using the polar
representation because of the circular shape of the path around $z_0$. Here r is
small and will eventually be made to approach zero. We have

$$\oint_{C_2} \frac{f(z)}{z - z_0}\,dz = \oint_{C_2} \frac{f(z_0 + re^{i\theta})}{re^{i\theta}}\, rie^{i\theta}\,d\theta.$$

Taking the limit as $r \to 0$, we obtain

$$\oint_{C_2} \frac{f(z)}{z - z_0}\,dz = if(z_0)\int_{C_2} d\theta = 2\pi i f(z_0), \tag{6.45}$$

since f(z) is analytic and therefore continuous at $z = z_0$. This proves the Cauchy
integral formula.
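The Cauchy integral formula can be verified numerically by sampling f(z) on the contour alone. The following sketch assumes a circular contour and uses hypothetical names (`cauchy_formula`, `z_in`, `z_out`); it recovers $e^{z_0}$ at an interior point from boundary values and also evaluates the same sum for a point outside the contour, where the integrand is analytic throughout the interior.

```python
import cmath
import math

def cauchy_formula(f, z0, center=0j, radius=1.0, n=400):
    """Evaluate (1/(2 pi i)) times the integral of f(z)/(z - z0) dz
    counterclockwise around the circle |z - center| = radius."""
    total = 0j
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        z = center + radius * e
        total += f(z) / (z - z0) * (1j * radius * e) * (2 * math.pi / n)
    return total / (2j * math.pi)

z_in = 0.3 + 0.4j                      # inside the unit circle
z_out = 1.5 - 0.5j                     # outside the unit circle
inside = cauchy_formula(cmath.exp, z_in)
outside = cauchy_formula(cmath.exp, z_out)
print(inside, cmath.exp(z_in))         # the two values agree
print(abs(outside))                    # close to 0
```

Only values of f(z) on the circle enter the sum, yet the interior value is reproduced.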

Here is a remarkable result. The value of an analytic function f(z) is given
at an interior point $z = z_0$ once the values on the boundary C are specified.
This is closely analogous to a two-dimensional form of Gauss’s law (Section 1.14)
in which the magnitude of an interior line charge would be given in terms of
the cylindrical surface integral of the electric field E.

A further analogy is the determination of a function in real space by an 
integral of the function and the corresponding Green’s function (and their 
derivatives) over the bounding surface. Kirchhoff diffraction theory is an 
example of this. 

It has been emphasized that $z_0$ is an interior point. What happens if $z_0$ is
exterior to C? In this case the entire integrand is analytic on and within C.
Cauchy’s integral theorem, Section 6.3, applies and the integral vanishes. We
have

$$\frac{1}{2\pi i}\oint_C \frac{f(z)\,dz}{z - z_0} =
\begin{cases} f(z_0), & z_0 \text{ interior} \\ 0, & z_0 \text{ exterior.} \end{cases}$$


Derivatives 

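Cauchy’s integral formula may be used to obtain an expression for the
derivative of f(z). From Eq. 6.43, with f(z) analytic,

$$\frac{f(z_0 + \delta z_0) - f(z_0)}{\delta z_0} = \frac{1}{2\pi i\,\delta z_0}
\left(\oint \frac{f(z)}{z - z_0 - \delta z_0}\,dz - \oint \frac{f(z)}{z - z_0}\,dz\right).$$

Then, by definition of derivative (Eq. 6.14),

$$f'(z_0) = \lim_{\delta z_0 \to 0} \frac{1}{2\pi i\,\delta z_0} \oint \frac{\delta z_0\, f(z)}{(z - z_0 - \delta z_0)(z - z_0)}\,dz
= \frac{1}{2\pi i}\oint \frac{f(z)}{(z - z_0)^2}\,dz. \tag{6.46}$$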


The alert reader will see that this result could have been obtained by differ- 
entiating Eq. 6.43 under the integral sign with respect to z 0 . This formal or 
turning-the-crank approach is valid, but the justification for it is contained in 
the preceding analysis. 

This technique for constructing derivatives may be repeated. We write
$f'(z_0 + \delta z_0)$ and $f'(z_0)$, using Eq. 6.46. Subtracting, dividing by $\delta z_0$, and finally
taking the limit as $\delta z_0 \to 0$, we have

$$f^{(2)}(z_0) = \frac{2}{2\pi i}\oint \frac{f(z)\,dz}{(z - z_0)^3}.$$

Note that $f^{(2)}(z_0)$ is independent of the direction of $\delta z_0$ as it must be. Continuing,
we get¹

$$f^{(n)}(z_0) = \frac{n!}{2\pi i}\oint \frac{f(z)\,dz}{(z - z_0)^{n+1}}; \tag{6.47}$$

that is, the requirement that f(z) be analytic not only guarantees a first derivative
but derivatives of all orders as well! The derivatives of f(z) are automatically
analytic. The reader should notice that this statement assumes the Goursat
version of the Cauchy integral theorem. This is why Goursat’s contribution is so
significant in the development of the theory of complex variables.
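Equation 6.47 turns differentiation into integration over boundary values, which can be checked numerically. The sketch below is illustrative only: the helper name `cauchy_derivative`, the circular contour centered on $z_0$, and the test function sin z are assumptions.

```python
import cmath
import math

def cauchy_derivative(f, z0, order, radius=1.0, n=400):
    """Eq. 6.47 on the circle |z - z0| = radius:
    f^(order)(z0) = order!/(2 pi i) * integral of f(z) dz/(z - z0)^(order+1)."""
    total = 0j
    for k in range(n):
        e = cmath.exp(2j * math.pi * k / n)
        # With z - z0 = radius*e and dz = i*radius*e*dtheta, the integrand
        # reduces to i*f(z)/(radius*e)^order per unit theta.
        total += 1j * f(z0 + radius * e) / (radius * e) ** order * (2 * math.pi / n)
    return math.factorial(order) * total / (2j * math.pi)

z0 = 0.2 + 0.1j
d1 = cauchy_derivative(cmath.sin, z0, 1)
d3 = cauchy_derivative(cmath.sin, z0, 3)
print(d1, cmath.cos(z0))       # first derivative of sin is cos
print(d3, -cmath.cos(z0))      # third derivative of sin is -cos
```

No difference quotients are formed; every derivative comes from the same set of boundary samples, weighted by a different power of $(z - z_0)$.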


Morera's Theorem 

A further application of Cauchy’s integral formula is in the proof of Morera’s
theorem, which is the converse of Cauchy’s integral theorem. The theorem states
the following:

If a function f(z) is continuous in a simply connected region R and $\oint_C f(z)\,dz = 0$
for every closed contour C within R, then f(z) is analytic throughout R.

1 This expression is the starting point for defining derivatives of fractional
order. See A. Erdelyi, et al., Tables of Integral Transforms, Vol. 2, New York:
McGraw-Hill (1954). For recent applications to mathematical analysis see
T. J. Osler, “An integral analogue of Taylor’s series and its use in computing
Fourier transforms.” Math. Computation 26, 449 (1972) and his references.

Let us integrate f(z) from $z_1$ to $z_2$. Since every closed path integral of f(z)
vanishes, the integral is independent of path and depends only on its end points.
We label the result of the integration F(z), with

$$F(z_2) - F(z_1) = \int_{z_1}^{z_2} f(z)\,dz. \tag{6.48}$$


As an identity,

$$\frac{F(z_2) - F(z_1)}{z_2 - z_1} - f(z_1) = \frac{\int_{z_1}^{z_2} [f(t) - f(z_1)]\,dt}{z_2 - z_1}, \tag{6.49}$$

using t as another complex variable. Now we take the limit as $z_2 \to z_1$.

$$\lim_{z_2 \to z_1} \frac{\int_{z_1}^{z_2} [f(t) - f(z_1)]\,dt}{z_2 - z_1} = 0, \tag{6.50}$$

since f(t) is continuous.² Therefore

$$\lim_{z_2 \to z_1} \frac{F(z_2) - F(z_1)}{z_2 - z_1} = F'(z)\Big|_{z=z_1} = f(z_1) \tag{6.51}$$

by definition of derivative (Eq. 6.14). We have proved that F'(z) at $z = z_1$ exists
and equals $f(z_1)$. Since $z_1$ is any point in R, we see that F(z) is analytic. Then by
Cauchy’s integral formula (compare Eq. 6.47) F'(z) = f(z) is also analytic,
proving Morera’s theorem.

Drawing once more on our electrostatic analog, we might use f(z) to represent
the electrostatic field E. If the net charge within every closed region in R is zero
(Gauss’s law), the charge density is everywhere zero in R. Alternatively, in terms
of the analysis of Section 1.13, f(z) represents a conservative force (by definition
of conservative), and then we find that it is always possible to express it as the
derivative of a potential function F(z).


EXERCISES 


6.4.1 Show that

$$\oint_C (z - z_0)^n\,dz = \begin{cases} 2\pi i, & n = -1, \\ 0, & n \ne -1, \end{cases}$$

where the contour C encircles the point $z = z_0$ in a positive (counterclockwise)
sense. The exponent n is an integer.

The calculus of residues, Chapter 7, is based on this result.


2 We can quote the mean value theorem of calculus here.

6.4.2 Show that

$$\frac{1}{2\pi i}\oint z^{m-n-1}\,dz, \qquad m \text{ and } n \text{ integers}$$

(with the contour encircling the origin once counterclockwise), is a representation
of the Kronecker delta $\delta_{mn}$.

6.4.3 Solve Exercise 6.3.5 by separating the integrand into partial fractions and then
applying Cauchy’s integral theorem for multiply connected regions. 

Note. Partial fractions are explained in Section 15.7 in connection with Laplace 
transforms. 


6.4.4 Evaluate



where C is the circle |z| = 2.

6.4.5 Assuming that f(z) is analytic on and within a closed contour C and that the
point $z_0$ is within C, show that

$$\oint_C \frac{f'(z)}{z - z_0}\,dz = \oint_C \frac{f(z)}{(z - z_0)^2}\,dz.$$


6.4.6 You know that f(z) is analytic on and within a closed contour C. You suspect
that the nth derivative $f^{(n)}(z_0)$ is given by

$$f^{(n)}(z_0) = \frac{n!}{2\pi i}\oint_C \frac{f(z)}{(z - z_0)^{n+1}}\,dz.$$

Using mathematical induction, prove that this expression is correct.


6.4.7 Show that

$$|f^{(n)}(z_0)| \le \frac{M \cdot n!}{R^n},$$

where R is the radius of a circle centered at $z = z_0$ and M is the maximum value
of |f(z)| on that circle. Assume that f(z) is analytic on and within the circle.

6.4.8 If f(z) is analytic and bounded [$|f(z)| \le M$, a constant] for all z, show that f(z)
must be a constant. This is Liouville’s theorem.


6.4.9 Fundamental theorem of algebra. As a corollary of Liouville’s theorem, Exercise
6.4.8, show that every polynomial equation,

$$P(z) = a_0 + a_1 z + \cdots + a_n z^n = 0$$

has at least one root. Here n > 0 and $a_n \ne 0$.

Hint. Consider f(z) = 1/P(z).

Note. Once the preceding result is established we can divide out the root and
repeat the process for the resulting (n - 1)-degree polynomial. This leads to the
conclusion that P(z) has exactly n roots.

6.4.10 (a) A function f(z) is analytic within a closed contour C (and continuous on C).
If $f(z) \ne 0$ within C and $|f(z)| \ge M$ on C, show that

$$|f(z)| \ge M$$

for all points within C.

Hint. Consider w(z) = 1/f(z).

(b) If f(z) = 0 within the contour C, show that the foregoing result does not
hold, that it is possible to have |f(z)| = 0 at one or more points in the interior
with |f(z)| > 0 over the entire bounding contour. Cite a specific example of
an analytic function that behaves this way.


6.4.11 Using the Cauchy integral formula for the nth derivative, convert the following
Rodrigues formulas into the corresponding Schlaefli integrals.

(a) Legendre

$$P_n(x) = \frac{1}{2^n n!}\frac{d^n}{dx^n}(x^2 - 1)^n.$$

ANS. $\dfrac{(-1)^n}{2^n}\cdot\dfrac{1}{2\pi i}\oint \dfrac{(1 - z^2)^n}{(z - x)^{n+1}}\,dz.$

(b) Hermite

$$H_n(x) = (-1)^n e^{x^2}\frac{d^n}{dx^n}e^{-x^2}.$$

(c) Laguerre

$$L_n(x) = \frac{e^x}{n!}\frac{d^n}{dx^n}(x^n e^{-x}).$$

Note. From the Schlaefli integral representations one can develop generating
functions for these special functions. Compare Sections 12.4, 13.1, and 13.2.


6.5 LAURENT EXPANSION 

Taylor Expansion 

The Cauchy integral formula of the preceding section opens up the way for
another derivation of Taylor’s series (Section 5.6), but this time for functions of a
complex variable. Suppose we are trying to expand f(z) about $z = z_0$ and we have
$z = z_1$ as the nearest point on the Argand diagram for which f(z) is not analytic.
We construct a circle C centered at $z = z_0$ with radius $|z' - z_0| < |z_1 - z_0|$ (Fig.
6.12). Since $z_1$ was assumed to be the nearest point at which f(z) was not analytic,
f(z) is necessarily analytic on and within C.



FIG. 6.12

From Equation 6.43, the Cauchy integral formula,

$$f(z) = \frac{1}{2\pi i}\oint_C \frac{f(z')\,dz'}{z' - z}
= \frac{1}{2\pi i}\oint_C \frac{f(z')\,dz'}{(z' - z_0) - (z - z_0)}
= \frac{1}{2\pi i}\oint_C \frac{f(z')\,dz'}{(z' - z_0)[1 - (z - z_0)/(z' - z_0)]}. \tag{6.52}$$


Here z' is a point on the contour C and z is any point interior to C. It is not quite
rigorously legal to expand the denominator of the integrand in Eq. 6.52 by the
binomial theorem, for we have not yet proved the binomial theorem for complex
variables. Instead, we note the identity

$$\frac{1}{1 - t} = 1 + t + t^2 + t^3 + \cdots = \sum_{n=0}^{\infty} t^n, \tag{6.53}$$

which may easily be verified by multiplying both sides by 1 - t. The infinite
series, following the methods of Section 5.2, is convergent for |t| < 1.

Now for point z interior to C, $|z - z_0| < |z' - z_0|$, and, using Eq. 6.53, Eq.
6.52 becomes

$$f(z) = \frac{1}{2\pi i}\oint_C \sum_{n=0}^{\infty} \frac{(z - z_0)^n f(z')\,dz'}{(z' - z_0)^{n+1}}. \tag{6.54}$$


Interchanging the order of integration and summation (valid since Eq. 6.53 is
uniformly convergent for |t| < 1), we obtain

$$f(z) = \frac{1}{2\pi i}\sum_{n=0}^{\infty} (z - z_0)^n \oint_C \frac{f(z')\,dz'}{(z' - z_0)^{n+1}}. \tag{6.55}$$

Referring to Eq. 6.47, we get

$$f(z) = \sum_{n=0}^{\infty} (z - z_0)^n \frac{f^{(n)}(z_0)}{n!}, \tag{6.56}$$


which is our desired Taylor expansion. Note that it is based only on the assumption
that f(z) is analytic for $|z - z_0| < |z_1 - z_0|$. Just as for real variable power
series (Section 5.7), this expansion is unique for a given $z_0$.

From the Taylor expansion for /(z) a binomial theorem may be derived — 
Exercise 6.5.2. 
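The coefficient integrals in Eq. 6.55 can be evaluated numerically and compared with a known series. This is a sketch under stated assumptions: the helper `taylor_coefficient` is hypothetical, the contour is a circle of radius 0.5 about $z_0 = 0$, and the test function is 1/(1 + z), whose Maclaurin coefficients are $(-1)^n$.

```python
import cmath
import math

def taylor_coefficient(f, z0, n, radius, samples=400):
    """(1/(2 pi i)) * integral of f(z') dz'/(z' - z0)^(n+1) over the circle
    |z' - z0| = radius; on the circle this is the average of f/(z' - z0)^n."""
    total = 0j
    for k in range(samples):
        e = cmath.exp(2j * math.pi * k / samples)
        total += f(z0 + radius * e) / (radius * e) ** n
    return total / samples

f = lambda z: 1 / (1 + z)              # nearest singular point at z = -1
coeffs = [taylor_coefficient(f, 0, n, 0.5) for n in range(30)]
print([round(c.real, 6) for c in coeffs[:6]])

# Rebuild f at an interior point from the series of Eq. 6.56.
z = 0.2 + 0.1j
series = sum(c * z ** n for n, c in enumerate(coeffs))
print(series, f(z))
```

The contour radius only has to be smaller than the distance to the nearest singular point; any such radius should give the same coefficients.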


Schwarz Reflection Principle 

From the binomial expansion of $g(z) = (z - x_0)^n$ for integral n it is easy to
see that the complex conjugate of the function is the function of the complex
conjugate,

$$g^*(z) = [(z - x_0)^n]^* = (z^* - x_0)^n = g(z^*). \tag{6.57}$$

This leads us to the Schwarz reflection principle:

If a function f(z) is (1) analytic over some region including the real axis and (2)
real when z is real, then

$$f^*(z) = f(z^*) \tag{6.58}$$

(see Fig. 6.13).

Expanding f(z) about some (nonsingular) point $x_0$ on the real axis,

$$f(z) = \sum_{n=0}^{\infty} (z - x_0)^n \frac{f^{(n)}(x_0)}{n!} \tag{6.59}$$

by Eq. 6.56. Since f(z) is analytic at $z = x_0$, this Taylor expansion exists. Since
f(z) is real when z is real, $f^{(n)}(x_0)$ must be real for all n. Then when we use Eq.
6.57, Eq. 6.58, the Schwarz reflection principle, follows immediately. Exercise
6.5.6 is another form of this principle.
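The reflection principle, Eq. 6.58, is simple to test at a sample point. In this sketch the helper `reflection_defect` and the sample point are assumptions; sin z satisfies both conditions of the principle, while iz is analytic but not real on the real axis.

```python
import cmath

def reflection_defect(f, z):
    """|f*(z) - f(z*)|, which vanishes when Eq. 6.58 holds at z."""
    return abs(f(z).conjugate() - f(z.conjugate()))

z = 0.7 + 1.3j
good = reflection_defect(cmath.sin, z)          # sin x is real for real x
bad = reflection_defect(lambda w: 1j * w, z)    # i*x is not real for real x
print(good)     # essentially zero
print(bad)      # equals 2|z|, so the principle fails for f(z) = iz
```

For f(z) = iz one finds f*(z) = -i z* while f(z*) = i z*, so the defect is 2|z| rather than zero.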

Analytic Continuation 

In the foregoing discussion we assumed that f(z) has an isolated nonanalytic
or singular point at $z = z_1$ (Fig. 6.12). For a specific example of this behavior
consider

$$f(z) = \frac{1}{1 + z}, \tag{6.60}$$

which becomes infinite at z = -1. Therefore f(z) is nonanalytic at $z_1 = -1$ or
$z_1 = -1$ is our singular point. By Eq. 6.56 or the binomial theorem for complex
functions that follows directly from it,

$$\frac{1}{1 + z} = 1 - z + z^2 - z^3 + \cdots = \sum_{n=0}^{\infty} (-1)^n z^n, \tag{6.61}$$

convergent for |z| < 1. If we label this circle of convergence $C_1$, Eq. 6.61 holds for
f(z) in the interior of $C_1$, which we label region $S_1$.

The situation is that f(z) expanded about the origin holds only in $S_1$ (and on
$C_1$ excluding $z_1 = -1$), but we know from the form of f(z) that it is well defined
and analytic elsewhere in the complex plane outside $S_1$. Analytic continuation
is a process of extending the region in which a function such as the series in Eq.
6.61 is defined. For instance, suppose we expand f(z) about the point z = i. We
have

$$f(z) = \frac{1}{1 + z} = \frac{1}{1 + i + (z - i)} = \frac{1}{(1 + i)\left[1 + \dfrac{z - i}{1 + i}\right]}. \tag{6.62}$$

By Eq. 6.56 again or 6.62

$$f(z) = \frac{1}{1 + i}\left[1 - \frac{z - i}{1 + i} + \frac{(z - i)^2}{(1 + i)^2} - \cdots\right], \tag{6.63}$$

convergent for $|z - i| < |1 + i| = \sqrt{2}$. Our circle of convergence is $C_2$ and the
region bounded by $C_2$ is labeled $S_2$ (Fig. 6.14). Now f(z) is defined by the expansion
(Eq. 6.63) for $S_2$, which overlaps $S_1$ and extends out further in the complex
plane.¹ This extension is an analytic continuation, and when we have only
isolated singular points to contend with, the function can be extended indefinitely.
Equations 6.60, 6.61, and 6.63 are three different representations of
the same function. Each representation has its own domain of convergence.
Equation 6.61 is a Maclaurin series. Equation 6.63 is a Taylor expansion about
z = i and from the following paragraphs Eq. 6.60 is seen to be a one-term
Laurent series.
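The different domains of convergence can be seen numerically. The sketch below (helper names `maclaurin_sum` and `shifted_sum` and the sample point z = 1.5i are assumptions) evaluates partial sums of the expansion about the origin, Eq. 6.61, and of the expansion about z = i, Eq. 6.63, at a point that lies outside the unit circle but within $\sqrt{2}$ of z = i.

```python
def maclaurin_sum(z, terms):
    """Partial sum of Eq. 6.61: sum over n of (-1)^n z^n."""
    return sum((-z) ** n for n in range(terms))

def shifted_sum(z, terms):
    """Partial sum of Eq. 6.63, the expansion of 1/(1 + z) about z = i."""
    t = (z - 1j) / (1 + 1j)
    return sum((-t) ** n for n in range(terms)) / (1 + 1j)

z = 1.5j                       # |z| = 1.5 > 1, but |z - i| = 0.5 < sqrt(2)
exact = 1 / (1 + z)
err_shifted = abs(shifted_sum(z, 60) - exact)
err_maclaurin = abs(maclaurin_sum(z, 60) - exact)
print(err_shifted)             # tiny: z is inside the second circle
print(err_maclaurin)           # huge: z is outside the unit circle
```

The closed form 1/(1 + z) is finite at this point; only the series about the origin fails there, which is the sense in which the second expansion continues the function.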

Analytic continuation may take many forms and the series expansion just
considered is not necessarily the most convenient technique. As an alternate
technique we shall use a recurrence relation in Section 10.1 to extend the factorial
function around the isolated singular points, z = -n, n = 1, 2, 3, .... As another
example, the hypergeometric equation is satisfied by the hypergeometric function
defined by the series, Eq. 13.114, for |z| < 1. The integral representation
given in Exercise 13.5.8 permits a continuation over the entire complex plane.


Permanence of Algebraic Form 

All our elementary functions, $e^z$, sin z, and so on can be extended into the
complex plane (compare Exercise 6.1.9). For instance, they can be defined by
power-series expansions such as

$$e^z = 1 + \frac{z}{1!} + \frac{z^2}{2!} + \cdots = \sum_{n=0}^{\infty} \frac{z^n}{n!} \tag{6.64}$$

for the exponential. Such definitions agree with the real variable definition
along the real x-axis and literally constitute an analytic continuation of the
corresponding real functions into the complex plane. This result is often called
permanence of the algebraic form.


Laurent Series 

We frequently encounter functions that are analytic in an annular region, say,
of inner radius r and outer radius R, as shown in Fig. 6.15. Drawing an imaginary
cut line to convert our region into a simply connected region, we apply Cauchy’s
integral formula, and for two circles, $C_2$ and $C_1$, centered at $z = z_0$ and with
radii $r_2$ and $r_1$, respectively, where $r < r_2 < r_1 < R$, we have²

$$f(z) = \frac{1}{2\pi i}\oint_{C_1} \frac{f(z')\,dz'}{z' - z} - \frac{1}{2\pi i}\oint_{C_2} \frac{f(z')\,dz'}{z' - z}. \tag{6.65}$$


1 One of the most powerful and beautiful results of the more abstract theory
of functions of a complex variable is that if two analytic functions coincide
in any region, such as the overlap of $S_1$ and $S_2$, or coincide on any line segment,
they are the same function in the sense that they will coincide everywhere as
long as they are both well defined. In this case the agreement of the expansions
(Eqs. 6.61 and 6.63) over the region common to $S_1$ and $S_2$ would establish
the identity of the functions these expansions represent. Then Eq. 6.63 would
represent an analytic continuation or extension of f(z) into regions not
covered by Eq. 6.61. We could equally well say that f(z) = 1/(1 + z) is itself
an analytic continuation of either of the series given by Eqs. 6.61 and 6.63.

2 We may take $r_2$ arbitrarily close to r and $r_1$ arbitrarily close to R, maximizing
the area enclosed between $C_1$ and $C_2$.

FIG. 6.15 $|z' - z_0|_{C_1} > |z - z_0|$; $|z' - z_0|_{C_2} < |z - z_0|$


Note carefully that in Eq. 6.65 an explicit minus sign has been introduced so that
contour $C_2$ (like $C_1$) is to be traversed in the positive (counterclockwise) sense.
The treatment of Eq. 6.65 now proceeds exactly like that of Eq. 6.62 in the
development of the Taylor series. Each denominator is written as $(z' - z_0) -
(z - z_0)$ and expanded by the binomial theorem which now follows from the
Taylor series (Eq. 6.56).

Noting that for $C_1$, $|z' - z_0| > |z - z_0|$ while for $C_2$, $|z' - z_0| < |z - z_0|$, we
find

$$f(z) = \frac{1}{2\pi i}\sum_{n=0}^{\infty} (z - z_0)^n \oint_{C_1} \frac{f(z')\,dz'}{(z' - z_0)^{n+1}}
+ \frac{1}{2\pi i}\sum_{n=1}^{\infty} (z - z_0)^{-n} \oint_{C_2} (z' - z_0)^{n-1} f(z')\,dz'. \tag{6.66}$$


The minus sign of Eq. 6.65 has been absorbed by the binomial expansion.
Labeling the first series $S_1$ and the second $S_2$,

$$S_1 = \frac{1}{2\pi i}\sum_{n=0}^{\infty} (z - z_0)^n \oint_{C_1} \frac{f(z')\,dz'}{(z' - z_0)^{n+1}}, \tag{6.67}$$

which is the regular Taylor expansion, convergent for $|z - z_0| < |z' - z_0| = r_1$,
that is, for all z interior to the larger circle, $C_1$. For the second series in Eq. 6.66
we have

$$S_2 = \frac{1}{2\pi i}\sum_{n=1}^{\infty} (z - z_0)^{-n} \oint_{C_2} (z' - z_0)^{n-1} f(z')\,dz', \tag{6.68}$$

convergent for $|z - z_0| > |z' - z_0| = r_2$, that is, for all z exterior to the smaller
circle $C_2$. Remember, $C_2$ now goes counterclockwise.

These two series may be combined into one series³ (a Laurent series) by

$$f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n, \tag{6.69}$$

where

$$a_n = \frac{1}{2\pi i}\oint_C \frac{f(z')\,dz'}{(z' - z_0)^{n+1}}. \tag{6.70}$$


Since, in Eq. 6.70, convergence of a binomial expansion is no longer a problem,
C may be any contour within the annular region $r < |z - z_0| < R$ encircling $z_0$
once in a counterclockwise sense. If we assume that such an annular region of
convergence does exist, Eq. 6.69 is the Laurent series or Laurent expansion of
f(z).

The use of the cut line (Fig. 6.15) is convenient in converting the annular 
region into a simply connected region. Since our function is analytic in this 
annular region (and therefore single-valued), the cut line is not essential and, 
indeed, does not appear in the final result, Eq. 6.70. In contrast to this, functions 
with branch points must have cut lines — Section 7.1. 

Laurent series coefficients need not come from evaluation of contour inte- 
grals (which may be very intractable). Other techniques such as ordinary series 
expansions may provide the coefficients. 

Numerous examples of Laurent series appear in Chapter 7. We limit our- 
selves here to one simple example to illustrate the application of Eq. 6.69. 


EXAMPLE 6.5.1 


Let $f(z) = [z(z - 1)]^{-1}$. If we choose $z_0 = 0$, then r = 0 and R = 1, f(z)
diverging at z = 1. From Eqs. 6.70 and 6.69

$$a_n = \frac{1}{2\pi i}\oint \frac{dz'}{(z')^{n+2}(z' - 1)}
= \frac{-1}{2\pi i}\oint \sum_{m=0}^{\infty} (z')^m \frac{dz'}{(z')^{n+2}}. \tag{6.71}$$


Again, interchanging the order of summation and integration (uniformly convergent
series), we have

$$a_n = -\frac{1}{2\pi i}\sum_{m=0}^{\infty} \oint \frac{dz'}{(z')^{n+2-m}}. \tag{6.72}$$


If we employ the polar form, as in Eq. 6.45 (or compare Exercise 6.4.1),

$$a_n = -\frac{1}{2\pi i}\sum_{m=0}^{\infty} \oint \frac{rie^{i\theta}\,d\theta}{r^{n+2-m}e^{i(n+2-m)\theta}}
= -\frac{1}{2\pi i}\cdot 2\pi i \sum_{m=0}^{\infty} \delta_{n+2-m,1}. \tag{6.73}$$

In other words,

$$a_n = \begin{cases} -1 & \text{for } n \ge -1, \\ 0 & \text{for } n < -1. \end{cases} \tag{6.74}$$

The Laurent expansion (Eq. 6.69) becomes

$$\frac{1}{z(z - 1)} = -\frac{1}{z} - 1 - z - z^2 - z^3 - \cdots = -\sum_{n=-1}^{\infty} z^n. \tag{6.75}$$

3 Replace n by -n in $S_2$ and add.


For this simple function the Laurent series can, of course, be obtained by a 
direct binomial expansion. 
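The coefficients of Example 6.5.1 can also be obtained by evaluating Eq. 6.70 numerically. The following sketch assumes a circular contour of radius 0.5 inside the annulus 0 < |z| < 1; the helper name `laurent_coefficient` is hypothetical.

```python
import cmath
import math

def laurent_coefficient(f, z0, n, radius, samples=400):
    """Eq. 6.70 on the circle |z' - z0| = radius (counterclockwise).
    On the circle the integral reduces to the average of f/(z' - z0)^n."""
    total = 0j
    for k in range(samples):
        e = cmath.exp(2j * math.pi * k / samples)
        total += f(z0 + radius * e) / (radius * e) ** n
    return total / samples

f = lambda z: 1 / (z * (z - 1))
coeffs = {n: laurent_coefficient(f, 0, n, 0.5) for n in range(-4, 4)}
for n in sorted(coeffs):
    print(n, round(coeffs[n].real, 6))
```

The printed values should reproduce Eq. 6.74: zero for n < -1 and -1 for n >= -1. Any other radius between 0 and 1 should give the same coefficients, in line with the freedom in choosing C in Eq. 6.70.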

The Laurent series differs from the Taylor series by the obvious feature of
negative powers of $(z - z_0)$. For this reason the Laurent series will always
diverge at least at $z = z_0$ and perhaps as far out as some distance r (Fig. 6.15).


EXERCISES 


6.5.1 Develop the Taylor expansion of ln(1 + z).

ANS. $\sum_{n=1}^{\infty} (-1)^{n-1}\dfrac{z^n}{n}$.

6.5.2 Derive the binomial expansion

$$(1 + z)^m = 1 + mz + \frac{m(m - 1)}{1 \cdot 2}z^2 + \cdots = \sum_{n=0}^{\infty} \binom{m}{n} z^n$$

for m any real number. The expansion is convergent for |z| < 1.

6.5.3 A function f(z) is analytic on and within the unit circle. Also, |f(z)| < 1 for |z| < 1
and f(0) = 0. Show that |f(z)| < |z| for |z| < 1.

Hint. One approach is to show that f(z)/z is analytic and then express $[f(z_0)/z_0]^n$
by the Cauchy integral formula. Finally, consider absolute magnitudes and take
the nth root. This exercise is sometimes called Schwarz’s theorem.


6.5.4 If f(z) is a real function of the complex variable z and the Laurent expansion
about the origin, $f(z) = \sum a_n z^n$, has $a_n = 0$ for n < -N, show that all of the
coefficients, $a_n$, are real.

6.5.5 A function f(z) = u(x,y) + iv(x,y) satisfies the conditions for the Schwarz reflection
principle. Show that

(a) u is an even function of y.

(b) v is an odd function of y.

6.5.6 A function f(z) can be expanded in a Laurent series about the origin with the
coefficients $a_n$ real. Show that the complex conjugate of this function of z is the
same function of the complex conjugate of z; that is,

$$f^*(z) = f(z^*).$$

Verify this explicitly for

(a) $f(z) = z^n$, n an integer,

(b) f(z) = sin z.

If f(z) = iz, ($a_1 = i$), show that the foregoing statement does not hold.

6.5.7 The function f(z) is analytic in a domain that includes the real axis. When z is
real (z = x), f(x) is pure imaginary.

(a) Show that

$$f(z^*) = -[f(z)]^*.$$

(b) For the specific case f(z) = iz, develop the cartesian forms of f(z), f(z*), and
f*(z). Do not quote the general result of part (a).

6.5.8 Develop the first three nonzero terms of the Laurent expansion of

$$f(z) = (e^z - 1)^{-1}$$

about the origin. Notice the resemblance to the Bernoulli number generating
function, Eq. 5.144 of Section 5.9.

6.5.9 Prove that the Laurent expansion of a given function about a given point is
unique; that is, if

$$f(z) = \sum_{n=-N}^{\infty} a_n (z - z_0)^n = \sum_{n=-N}^{\infty} b_n (z - z_0)^n,$$

show that $a_n = b_n$ for all n.

Hint. Use the Cauchy integral formula.

6.5.10 (a) Develop a Laurent expansion of $f(z) = [z(z - 1)]^{-1}$ about the point z = 1
valid for small values of |z - 1|. Specify the exact range over which your
expansion holds. This is an analytic continuation of Eq. 6.75.

(b) Determine the Laurent expansion of f(z) about z = 1 but for |z - 1| large.

6.5.11 (a) Given $f_1(z) = \int_0^{\infty} e^{-zt}\,dt$ (with t real), show that the domain in which $f_1(z)$
exists (and is analytic) is $\Re(z) > 0$.

(b) Show that $f_2(z) = 1/z$ equals $f_1(z)$ over $\Re(z) > 0$ and is therefore an analytic
continuation of $f_1(z)$ over the entire z-plane except for z = 0.

(c) Expand 1/z about the point z = i. You will have $f_3(z) = \sum_{n=0}^{\infty} a_n (z - i)^n$.
What is the domain of $f_3(z)$?

ANS. $\dfrac{1}{z} = -i\sum_{n=0}^{\infty} i^n (z - i)^n$, |z - i| < 1.


6.6 MAPPING 

In the preceding sections we have defined analytic functions and developed 
some of their main features. From these developments the integral relations of 
Chapter 7 follow directly. Here we introduce some of the more geometric aspects 
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of functions of complex variables, aspects that will be useful in visualizing the 
integral operations in Chapter 7 and that are valuable in their own right in 
solving Laplace’s equation in two-dimensional systems. 

In ordinary analytic geometry we may take y = f(x) and then plot y versus x. 
Our problem here is more complicated, for z is a function of two variables x and 
y. We use the notation 

$$ w = f(z) = u(x, y) + iv(x, y). \tag{6.76} $$

Then for a point in the z-plane (specific values for x and y) there may correspond
specific values for u(x, y) and v(x, y) which then yield a point in the w-plane. As
points in the z-plane transform or are mapped into points in the w-plane, lines 
or areas in the z-plane will be mapped into lines or areas in the w-plane. Our 
immediate purpose is to see how lines and areas map from the z-plane to the 
w-plane for a number of simple functions. 

Translation 


$$ w = z + z_0. \tag{6.77} $$

The function w is equal to the variable z plus a constant, $z_0 = x_0 + iy_0$. By Eqs.
6.1 and 6.76

$$ u = x + x_0, \qquad v = y + y_0, \tag{6.78} $$

representing a pure translation of the coordinate axes as shown in Fig. 6.16.




Rotation 

$$ w = z z_0. \tag{6.79} $$

Here it is convenient to return to the polar representation, using

$$ w = \rho e^{i\varphi}, \quad z = r e^{i\theta}, \quad \text{and} \quad z_0 = r_0 e^{i\theta_0}, \tag{6.80} $$

then

$$ \rho e^{i\varphi} = r r_0 e^{i(\theta + \theta_0)} \tag{6.81} $$

or

$$ \rho = r r_0, \qquad \varphi = \theta + \theta_0. \tag{6.82} $$

Two things have occurred. First, the modulus r has been modified, either
expanded or contracted, by the factor $r_0$. Second, the argument $\theta$ has been
increased by the additive constant $\theta_0$ (Fig. 6.17). This represents a rotation
of the complex variable through an angle $\theta_0$. For the special case of $z_0 = i$, we
have a pure rotation through $\pi/2$ radians.


Inversion 


$$ w = \frac{1}{z}. \tag{6.83} $$

Again, using the polar form, we have

$$ \rho e^{i\varphi} = \frac{1}{r e^{i\theta}} = \frac{1}{r} e^{-i\theta}, \tag{6.84} $$

which shows that

$$ \rho = \frac{1}{r}, \qquad \varphi = -\theta. \tag{6.85} $$

The first part of Eq. 6.85 shows the inversion clearly. The interior of the unit
circle is mapped onto the exterior and vice versa (Fig. 6.18). In addition, the
second part of Eq. 6.85 shows that the polar angle is reversed in sign. Equation
6.83 therefore also involves a reflection of the y-axis exactly like the complex
conjugate equation.

To see how lines in the z-plane transform into the w-plane, we simply return 
to the cartesian form: 


$$ u + iv = \frac{1}{x + iy}. \tag{6.86} $$

FIG. 6.18 Inversion

Rationalizing the right-hand side by multiplying numerator and denominator 
by z* and then equating the real parts and the imaginary parts, we have 


$$ u = \frac{x}{x^2 + y^2}, \qquad x = \frac{u}{u^2 + v^2}, $$

$$ v = -\frac{y}{x^2 + y^2}, \qquad y = -\frac{v}{u^2 + v^2}. \tag{6.87} $$

A circle centered at the origin in the z-plane has the form

$$ x^2 + y^2 = r^2 \tag{6.88} $$

and by Eq. 6.87 transforms into

$$ \frac{u^2}{(u^2 + v^2)^2} + \frac{v^2}{(u^2 + v^2)^2} = r^2. \tag{6.89} $$

Simplifying Eq. 6.89, we obtain

$$ u^2 + v^2 = \frac{1}{r^2} = \rho^2, \tag{6.90} $$

which describes a circle in the w-plane also centered at the origin.
The horizontal line $y = c_1$ transforms into

$$ \frac{-v}{u^2 + v^2} = c_1 \tag{6.91} $$


or

$$ u^2 + v^2 + \frac{v}{c_1} + \frac{1}{(2c_1)^2} = \frac{1}{(2c_1)^2}, \tag{6.92} $$

which describes a circle in the w-plane of radius $1/(2c_1)$ and centered at u = 0,
$v = -1/(2c_1)$ (Fig. 6.19).

FIG. 6.19 Inversion, line ↔ circle

The reader may pick up the other three possibilities, $x = \pm c_1$, $y = -c_1$, by
rotating the xy-axes. In general, any straight line or circle in the z-plane will
transform into a straight line or a circle in the w-plane (compare Exercise 6.6.1).
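
As a quick numerical illustration of the line-to-circle result, the sketch below maps a few points of the horizontal line $y = c_1$ through $w = 1/z$ and measures their distance from the predicted center $u = 0$, $v = -1/(2c_1)$. The value $c_1 = 0.5$, the sample abscissas, and the helper name `distances_from_center` are illustrative assumptions, not part of the text.

```python
# Illustrative check (assumed values): the line y = c1 under w = 1/z
# should land on a circle of radius 1/(2|c1|) centered at -i/(2 c1).
c1 = 0.5
center = complex(0.0, -1.0 / (2.0 * c1))
radius = 1.0 / (2.0 * abs(c1))

def distances_from_center(xs):
    """Map z = x + i*c1 through w = 1/z and return |w - center| for each x."""
    return [abs(1.0 / complex(x, c1) - center) for x in xs]

xs = [-10.0, -1.0, -0.1, 0.0, 0.3, 2.0, 50.0]
dists = distances_from_center(xs)
for x, d in zip(xs, dists):
    print(f"x = {x:7.2f}   |w - center| = {d:.12f}   predicted radius = {radius}")
```

Every printed distance should agree with the predicted radius to rounding error, whichever points of the line are sampled.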

The three transformations just discussed have all involved one-to-one corre- 
spondence of points in the z-plane to points in the w-plane. Now to illustrate 
the variety of transformations that are possible and the problems that can 
arise, we introduce first a two-to-one correspondence and then a many-to-one 
correspondence. Finally, we take up the inverses of these two transformations. 
Consider first the transformation 

$$ w = z^2, \tag{6.93} $$

which leads to

$$ \rho = r^2, \qquad \varphi = 2\theta. \tag{6.94} $$

Clearly, our transformation is nonlinear, for the modulus is squared, but the
significant feature of Eq. 6.94 is that the phase angle or argument is doubled.
This means that the

    first quadrant of z, $0 \le \theta < \pi/2$ → upper half-plane of w, $0 \le \varphi < \pi$,

    upper half-plane of z, $0 \le \theta < \pi$ → whole plane of w, $0 \le \varphi < 2\pi$.

The lower half-plane of z maps into the already covered entire plane of w, thus
covering the w-plane a second time. This is our two-to-one correspondence,
two distinct points in the z-plane, $z_0$ and $z_0 e^{i\pi} = -z_0$, corresponding to the
single point $w = z_0^2$.

In cartesian representation

$$ u + iv = (x + iy)^2 = x^2 - y^2 + i2xy, \tag{6.95} $$

leading to

$$ u = x^2 - y^2, \qquad v = 2xy. \tag{6.96} $$


Hence the lines $u = c_1$, $v = c_2$ in the w-plane correspond to $x^2 - y^2 = c_1$,
$2xy = c_2$, rectangular (and orthogonal) hyperbolas in the z-plane (Fig. 6.20).
To every point on the hyperbola $x^2 - y^2 = c_1$ in the right half-plane, x > 0,
one point on the line $u = c_1$ corresponds and vice versa. However, every point
on the line $u = c_1$ also corresponds to a point on the hyperbola $x^2 - y^2 = c_1$
in the left half-plane, x < 0, as already explained.



FIG. 6.20 Mapping — hyperbolic coordinates 

It will be shown in Section 6.7 that if lines in the w-plane are orthogonal the
corresponding lines in the z-plane are also orthogonal, as long as the
transformation is analytic. Since $u = c_1$ and $v = c_2$ are constructed perpendicular to
each other, the corresponding hyperbolas in the z-plane are orthogonal. We
have literally constructed a new orthogonal system of hyperbolic lines (or
surfaces if we add an axis perpendicular to x and y). Exercise 2.1.3 was an
analysis of this system. It might be noted that if the hyperbolic lines are electric
or magnetic lines of force, then we have a quadrupole lens useful in focusing
beams of high energy particles.

The transformation 

$$ w = e^z \tag{6.97} $$

leads to

$$ \rho e^{i\varphi} = e^{x + iy} \tag{6.98} $$

or

$$ \rho = e^x, \qquad \varphi = y. \tag{6.99} $$

FIG. 6.21 A cut line


If y ranges from $0 \le y < 2\pi$ (or $-\pi < y \le \pi$), then $\varphi$ covers the same range.
But this is the whole w-plane. In other words, a horizontal strip in the z-plane
of width $2\pi$ maps into the entire w-plane. Further, any point $x + i(y + 2n\pi)$,
in which n is any integer, maps into the same point (by Eq. 6.99) in the w-plane.
We have a many-(infinitely many)-to-one correspondence.

The inverse of the fourth transformation (Eq. 6.93) is 

$$ w = z^{1/2}. \tag{6.100} $$

From the relation

$$ \rho e^{i\varphi} = r^{1/2} e^{i\theta/2}, \tag{6.101} $$

and

$$ 2\varphi = \theta, \tag{6.102} $$

we now have two points in the w-plane (arguments $\varphi$ and $\varphi + \pi$) corresponding
to one point in the z-plane (except for the point z = 0). Or, to put it another way,
$\theta$ and $\theta + 2\pi$ correspond to $\varphi$ and $\varphi + \pi$, two distinct points in the w-plane.
This is the complex variable analog of the simple real variable equation $y^2 = x$,
in which two values of y, plus and minus, correspond to each value of x.

The important point here is that we can make the function w of Eq. 6.100 a
single-valued function instead of a double-valued function if we agree to restrict
$\theta$ to a range such as $0 \le \theta < 2\pi$. This may be done by agreeing never to cross
the line $\theta = 0$ in the z-plane (Fig. 6.21). Such a line of demarcation is called a
cut line. The point of termination (z = 0, here) in a multivalued function is
known as a branch point. It is a form of a singular point (compare Section 7.1),
$f(z)$ not being analytic at z = 0.

Any line running from z = 0 out to infinity would serve equally well. The
purpose of the cut line is to restrict the argument of z. The points $z_0$ and $z_0 e^{2\pi i}$
coincide in the z-plane but yield different points w and $w e^{\pi i} = -w$ in the w-plane.
Hence in the absence of a cut line the function $w = z^{1/2}$ is ambiguous.
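
The effect of a cut line can be seen numerically. The sketch below builds the branch of $z^{1/2}$ with $0 \le \theta < 2\pi$ (cut along the positive real axis) and evaluates it just above and just below that axis, and also on either side of the negative real axis. The function name `sqrt_cut_positive_axis`, the test point z = ±4, and the offset `eps` are illustrative choices, not taken from the text.

```python
import cmath
import math

def sqrt_cut_positive_axis(z):
    """Branch of z**(1/2) with 0 <= theta < 2*pi (cut along the positive real axis)."""
    theta = cmath.phase(z)            # cmath.phase returns an angle in [-pi, pi]
    if theta < 0.0:
        theta += 2.0 * math.pi        # shift into [0, 2*pi)
    return math.sqrt(abs(z)) * cmath.exp(0.5j * theta)

eps = 1e-9
above = sqrt_cut_positive_axis(complex(4.0, +eps))    # theta just above 0
below = sqrt_cut_positive_axis(complex(4.0, -eps))    # theta just below 2*pi
left_up = sqrt_cut_positive_axis(complex(-4.0, +eps))
left_dn = sqrt_cut_positive_axis(complex(-4.0, -eps))

print("across the cut (positive axis):", above, below)
print("across the negative axis:      ", left_up, left_dn)
```

The two values across the cut differ by the factor $e^{\pi i} = -1$, while the function is continuous across the negative real axis, where this branch has no cut.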

We shall encounter branch points and cut lines frequently in Chapter 7. 

Finally, as the inverse of the fifth transformation (Eq. 6.97), we have 


$$ w = \ln z. \tag{6.103} $$

By expanding it, we obtain

$$ u + iv = \ln r e^{i\theta} = \ln r + i\theta. \tag{6.104} $$

For a given point $z_0$ in the z-plane the argument $\theta$ is unspecified within an
integral multiple of $2\pi$. This means that

$$ v = \theta + 2n\pi, \tag{6.105} $$

and as in the exponential transformation, we have an infinitely many-to-one 
correspondence. 

Equation 6.103 has a nice physical representation. If we go around the unit
circle in the z-plane, r = 1, and by Eq. 6.104 $u = \ln r = 0$; but $v = \theta$, and $\theta$ is
steadily increasing and continues to increase as $\theta$ continues past $2\pi$. The
behavior in the w-plane as we go around and around the unit circle in the z-plane
is like the advance of a screw as it is rotated or the ascent of a person walking
up a spiral staircase (Fig. 6.22).

As in the preceding example, we make the correspondence unique (and
Eq. 6.103 unambiguous) by restricting $\theta$ to a range such as $0 \le \theta < 2\pi$ by
taking the line $\theta = 0$ (positive real axis) as a cut line. This is equivalent to
taking one and only one complete turn of the spiral staircase.

It is because of the multivalued nature of ln z that the contour integral

$$ \oint \frac{dz}{z} = 2\pi i \neq 0, $$

integrating about the origin. This property appears in Exercises 6.4.1 and 6.4.2
and is the basis for the entire calculus of residues (Chapter 7).
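
The nonvanishing loop integral can be checked with a few lines of arithmetic. The sketch below approximates the integral of dz/z around the unit circle by summing chord increments divided by z at the midpoint of each chord; the function name, the step count, and the `turns` parameter are illustrative assumptions.

```python
import cmath
import math

def loop_integral_of_inverse(n_steps=2000, turns=1):
    """Approximate the integral of dz/z along the unit circle, traversed `turns` times."""
    total = 0.0 + 0.0j
    dtheta = 2.0 * math.pi * turns / n_steps
    for k in range(n_steps):
        z_start = cmath.exp(1j * k * dtheta)
        z_end = cmath.exp(1j * (k + 1) * dtheta)
        z_mid = cmath.exp(1j * (k + 0.5) * dtheta)
        total += (z_end - z_start) / z_mid     # chord increment dz over z
    return total

once = loop_integral_of_inverse(turns=1)
twice = loop_integral_of_inverse(turns=2)
print("one turn :", once, "  compare 2*pi*i =", 2j * math.pi)
print("two turns:", twice, "  compare 4*pi*i =", 4j * math.pi)
```

Each additional turn adds another $2\pi i$, which is the increase of $v = \theta$ along the spiral staircase picture of ln z.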

The concept of mapping is a very broad and useful one in mathematics.
Our mapping from a complex z-plane to a complex w-plane is a simple
generalization of one definition of function: a mapping of x (from one set) into y in a
second set. A more sophisticated form of mapping appears in Section 8.7 where
we use the Dirac delta function $\delta(x - a)$ to map a function $f(x)$ into its value
at the point a. Then in Chapter 15 integral transforms are used to map one
function $f(x)$ in x-space into a second (related) function $F(t)$ in t-space.

EXERCISES

6.6.1 How do circles centered on the origin in the z-plane transform for

(a) $w_1(z) = z + \frac{1}{z}$, (b) $w_2(z) = z - \frac{1}{z}$, for $z \neq 0$?

What happens when $|z| \to 1$?

6.6.2 What part of the z-plane corresponds to the interior of the unit circle in the w-plane
if

(a) $w = \frac{z - 1}{z + 1}$, (b) $w = \frac{z - i}{z + i}$?

6.6.3 Discuss the transformations

(a) w(z) = sin z, (c) w(z) = sinh z,

(b) w(z) = cos z, (d) w(z) = cosh z.

Show how the lines $x = c_1$, $y = c_2$ map into the w-plane. Note that the last three
transformations can be obtained from the first one by appropriate translation
and/or rotation.

6.6.4 Show that the function

$$ w(z) = (z^2 - 1)^{1/2} $$

is single-valued if we take $-1 \le x \le 1$, y = 0 as a cut line.

6.6.5 Show that negative numbers have logarithms in the complex plane. In particular,
find ln(-1).

ANS. $\ln(-1) = i\pi$.

6.6.6 An integral representation of the Bessel function follows the contour in the t-plane
shown in Fig. 6.23. Map this contour into the $\theta$-plane with $t = e^{\theta}$. Many additional
examples of mapping are given in Chapters 11, 12, and 13.

6.7 CONFORMAL MAPPING 

In Section 6.6 hyperbolas were mapped into straight lines and straight lines
were mapped into circles. Yet in all these transformations one feature stayed
constant. This constancy was a result of the fact that all the transformations of
Section 6.6 were analytic.

FIG. 6.24 Conformal mapping: preservation of angles

As long as $w = f(z)$ is an analytic function, we have

$$ \frac{df}{dz} = \frac{dw}{dz} = \lim_{\Delta z \to 0} \frac{\Delta w}{\Delta z}. \tag{6.106} $$

Assuming that this equation is in polar form, we may equate modulus to modulus
and argument to argument. For the latter (assuming that $df/dz \neq 0$)

$$ \arg \lim_{\Delta z \to 0} \frac{\Delta w}{\Delta z} = \lim_{\Delta z \to 0} \arg \frac{\Delta w}{\Delta z}
 = \lim_{\Delta z \to 0} \arg \Delta w - \lim_{\Delta z \to 0} \arg \Delta z
 = \arg \frac{df}{dz} = \alpha, \tag{6.107} $$

where $\alpha$, the argument of the derivative, may depend on z but is a constant for a
fixed z, independent of the direction of approach. To see the significance of this,
consider two curves, $C_z$ in the z-plane and the corresponding curve $C_w$ in the
w-plane (Fig. 6.24). The increment $\Delta z$ is shown at an angle of $\theta$ relative to the
real (x) axis, whereas the corresponding increment $\Delta w$ forms an angle of $\varphi$
with the real (u) axis. From Eq. 6.107

$$ \varphi = \theta + \alpha, \tag{6.108} $$

or any line in the z-plane is rotated through an angle $\alpha$ in the w-plane as long
as w is an analytic transformation and the derivative is not zero.¹

Since this result holds for any line through $z_0$, it will hold for a pair of lines.
Then for the angle between these two lines

$$ \varphi_2 - \varphi_1 = (\theta_2 + \alpha) - (\theta_1 + \alpha) = \theta_2 - \theta_1, \tag{6.109} $$

which shows that the included angle is preserved under an analytic
transformation. Such angle-preserving transformations are called conformal. The
rotation angle $\alpha$ will, in general, depend on z. In addition, $|f'(z)|$ will usually
be a function of z.

¹ If $df/dz = 0$, its argument or phase is undefined and the (analytic)
transformation will not necessarily preserve angles.
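
A small numerical experiment makes the angle-preservation statement concrete. The sketch below uses $w = z^2$ as a sample analytic function, the point $z_0 = 1 + i$, and two directions at angles 0.3 and 1.1 radians; all of these values, the step size `h`, and the helper name `image_direction` are illustrative assumptions. It also repeats the measurement at $z_0 = 0$, where the derivative vanishes.

```python
import cmath

def f(z):
    return z * z          # sample analytic function; f'(z) = 2z

def image_direction(z0, theta, h=1e-6):
    """Angle of the image increment f(z0 + h e^{i theta}) - f(z0) in the w-plane."""
    dz = h * cmath.exp(1j * theta)
    return cmath.phase(f(z0 + dz) - f(z0))

theta1, theta2 = 0.3, 1.1

# At z0 = 1 + i the derivative is nonzero, so the included angle should be preserved
# and each direction should be rotated by alpha = arg f'(z0) = arg(2 + 2i) = pi/4.
z0 = 1.0 + 1.0j
phi1 = image_direction(z0, theta1)
phi2 = image_direction(z0, theta2)
angle_regular = phi2 - phi1
rotation = phi1 - theta1

# At z0 = 0 the derivative vanishes and the included angle is doubled instead.
angle_critical = image_direction(0.0, theta2) - image_direction(0.0, theta1)

print("included angle in z-plane:", theta2 - theta1)
print("included angle at 1 + i  :", angle_regular, "  rotation alpha:", rotation)
print("included angle at 0      :", angle_critical)
```

The doubling at the origin is the n = 2 case of the behavior treated in Exercise 6.7.1.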

Historically, these conformal transformations have been of great importance 
to scientists and engineers in solving Laplace’s equation for problems of electro- 
statics, hydrodynamics, heat flow, and so on. Unfortunately, the conformal 
transformation approach, however elegant, is limited to problems that can be 
reduced to two dimensions. The method is often beautiful if there is a high degree 
of symmetry present but often impossible if the symmetry is broken or absent. 
Because of these limitations and primarily because high-speed electronic com- 
puters offer a useful alternative (iterative solution of the partial differential 
equation), the details and applications of conformal mapping are omitted. 


EXERCISES 


6.7.1 Expand w(z) in a Taylor series about the point $z = z_0$ where $f'(z_0) = 0$. (Angles
not preserved.) Show that if the first n - 1 derivatives vanish but $f^{(n)}(z_0) \neq 0$, then
angles in the z-plane with vertices at $z = z_0$ appear in the w-plane multiplied by n.


6.7.2 Develop the transformations that create each of the four cylindrical coordinate
systems:

(a) Circular cylindrical    $x = \rho \cos\varphi$,
                            $y = \rho \sin\varphi$.

(b) Elliptic cylindrical    $x = a \cosh u \cos v$,
                            $y = a \sinh u \sin v$.

(c) Parabolic cylindrical   $x = \xi\eta$,
                            $y = \tfrac{1}{2}(\eta^2 - \xi^2)$.

(d) Bipolar                 $x = \frac{a \sinh\eta}{\cosh\eta - \cos\xi}$,
                            $y = \frac{a \sin\xi}{\cosh\eta - \cos\xi}$.

Note. These transformations are not necessarily analytic.


6.7.3 In the transformation

$$ e^z = \frac{a - w}{a + w}, $$

how do the coordinate lines in the z-plane transform? What coordinate system
have you constructed?

7 FUNCTIONS OF A COMPLEX VARIABLE II

CALCULUS OF RESIDUES


7.1 SINGULARITIES

In this chapter we return to the line of analysis that started with the Cauchy- 
Riemann conditions in Chapter 6 and led on through the Laurent expansion 
(Section 6.5). The Laurent expansion represents a generalization of the Taylor 
series in the presence of singularities. We define the point z 0 as an isolated 
singular point of the function f(z) if f(z) is not analytic at z = z 0 but is analytic 
at neighboring points. A function that is analytic throughout the entire finite 
complex plane except for isolated poles is called meromorphic. 

Poles 

In the Laurent expansion of $f(z)$ about $z_0$

$$ f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n. \tag{7.1} $$

If $a_n = 0$ for $n < -m < 0$ and $a_{-m} \neq 0$, we say that $z_0$ is a pole of order m.
For instance, if m = 1; that is, if $a_{-1}/(z - z_0)$ is the first nonvanishing term in
the Laurent series, we have a pole of order one, often called a simple pole.

If, on the other hand, the summation continues to $n = -\infty$, then $z_0$ is a pole
of infinite order and is called an essential singularity. These essential singularities
have many pathological features. For instance, we can show that in any small
neighborhood of an essential singularity of $f(z)$ the function $f(z)$ comes
arbitrarily close to any (and therefore every) preselected complex quantity $w_0$.¹
Literally, the entire w-plane is mapped into the neighborhood of the point $z_0$.
One point of fundamental difference between a pole of finite order and an
essential singularity is that a pole of order m can be removed by multiplying
$f(z)$ by $(z - z_0)^m$. This obviously cannot be done for an essential singularity.

¹ This theorem is due to Picard. A proof is given by E. C. Titchmarsh, The
Theory of Functions, 2nd ed. New York: Oxford University Press (1939).

The behavior of $f(z)$ as $z \to \infty$ is defined in terms of the behavior of $f(1/t)$
as $t \to 0$. Consider the function

$$ \sin z = \sum_{n=0}^{\infty} \frac{(-1)^n z^{2n+1}}{(2n+1)!}. \tag{7.2} $$

As $z \to \infty$, we replace the z by 1/t to obtain

$$ \sin\frac{1}{t} = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!\, t^{2n+1}}. \tag{7.3} $$

Clearly, from the definition, sin z has an essential singularity at infinity. This
result could be anticipated from Exercise 6.1.9 since

$$ \sin z = \sin iy = i \sinh y, \quad \text{when } x = 0, $$

which approaches infinity exponentially as $y \to \infty$.


Branch Points 

There is another sort of singularity that will be important in the later sections 
of this chapter. Consider 

$$ f(z) = z^a, $$

in which a is not an integer.² As z moves around the unit circle from $e^{0}$ to $e^{2\pi i}$,

$$ f(z) \to e^{2\pi a i} \neq e^{0 \cdot a i}, $$

for nonintegral a. As in Section 6.6, we have a branch point. The points $e^{0i}$
and $e^{2\pi i}$ in the z-plane coincide but these coincident points lead to different
values of $f(z)$; that is, $f(z)$ is a multivalued function. The problem is resolved
by constructing a cut line so that $f(z)$ will be uniquely specified for a given point
in the z-plane.

Note carefully that a function with a branch point and a required cut line 
will not be continuous across the cut line. In general, there will be a phase 
difference on opposite sides of this cut line. Hence line integrals on opposite 
sides of this branch point cut line will not generally cancel each other. Numerous 
examples of this appear in the exercises. 

The cut line used to convert a multiply connected region into a simply con- 
nected region (Section 6.3) is completely different. Our function is continuous 
across the cut line, and no phase difference exists. 

EXAMPLE 7.1.1 

Consider the function 

$$ f(z) = (z^2 - 1)^{1/2} = (z + 1)^{1/2} (z - 1)^{1/2}. \tag{7.4} $$

² z = 0 is technically a singular point, for $z^a$ has only a finite number of
derivatives, whereas an analytic function is guaranteed an infinite number of
derivatives (Section 6.4). The problem is that $f(z)$ is not single-valued as we encircle
the origin. The Cauchy integral formula may not be applied.

FIG. 7.1


The first factor on the right-hand side, $(z + 1)^{1/2}$, has a branch point at z = -1.
The second factor has a branch point at z = +1. To check on the possibility
of taking the line segment joining z = +1 and z = -1 as a cut line, let us
follow the phases of these two factors as we move along the contour shown in
Fig. 7.1.

For convenience in following the changes of phase let $z + 1 = r e^{i\theta}$ and
$z - 1 = \rho e^{i\varphi}$. Then the phase of $f(z)$ is $(\theta + \varphi)/2$. We start at point 1 where
both z + 1 and z - 1 have a phase of zero. Moving from point 1 to point 2, $\varphi$,
the phase of $z - 1 = \rho e^{i\varphi}$, increases by $\pi$. (z - 1 becomes negative.) $\varphi$ then
stays constant until the circle is completed, moving from 6 to 7. $\theta$, the phase
of $z + 1 = r e^{i\theta}$, shows a similar behavior, increasing by $2\pi$ as we move from
3 to 5. The phase of the function $f(z) = (z + 1)^{1/2} (z - 1)^{1/2} = r^{1/2} \rho^{1/2} e^{i(\theta + \varphi)/2}$ is
$(\theta + \varphi)/2$. This is tabulated in the final column of Table 7.1.


TABLE 7.1  Phase Angle

Point    $\theta$    $\varphi$    $(\theta + \varphi)/2$
  1      0           0            0
  2      0           $\pi$        $\pi/2$
  3      0           $\pi$        $\pi/2$
  4      $\pi$       $\pi$        $\pi$
  5      $2\pi$      $\pi$        $3\pi/2$
  6      $2\pi$      $\pi$        $3\pi/2$
  7      $2\pi$      $2\pi$       $2\pi$


Two features emerge:

1. The phase at points 5 and 6 is not the same as the
phase at points 2 and 3. This behavior can be expected
at a branch point cut line.

2. The phase at point 7 exceeds that at point 1 by $2\pi$
and the function $f(z) = (z^2 - 1)^{1/2}$ is therefore
single-valued for the contour shown, encircling both branch
points.

If we take the x-axis $-1 \le x \le 1$ as a cut line, $f(z)$ is uniquely specified.
Alternatively, the positive x-axis for x > 1 and the negative x-axis for x < -1
may be taken as cut lines. The branch points cannot be encircled and the
function remains single-valued.

Generalizing from this example, we have that the phase of a function

$$ f(z) = f_1(z) \cdot f_2(z) \cdot f_3(z) \cdots $$

is the algebraic sum of the phase of its individual factors:

$$ \arg f(z) = \arg f_1(z) + \arg f_2(z) + \arg f_3(z) + \cdots. $$

The phase of an individual factor may be taken as the arctangent of the ratio
of its imaginary part to its real part,

$$ \arg f_i(z) = \tan^{-1}\left(\frac{v_i}{u_i}\right). $$

For the case of a factor of the form

$$ f_i(z) = (z - z_0) $$

the phase corresponds to the phase angle of a two-dimensional vector from
$+z_0$ to z, the phase increasing by $2\pi$ as the point $+z_0$ is encircled. Conversely,
the traversal of any closed loop not encircling $z_0$ does not change the phase of
$z - z_0$.
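
The phase bookkeeping of Example 7.1.1 can be reproduced numerically by accumulating small phase increments of each factor around a closed loop. The sketch below is a minimal illustration; the helper `phase_change`, the circle radii, and the step count are assumptions chosen for the demonstration, and circular contours are used in place of the contour of Fig. 7.1.

```python
import cmath
import math

def phase_change(factor, center, radius, n_steps=1000):
    """Accumulated change in arg(factor(z)) as z goes once counterclockwise around a circle."""
    total = 0.0
    prev = factor(center + radius)
    for k in range(1, n_steps + 1):
        z = center + radius * cmath.exp(2j * math.pi * k / n_steps)
        cur = factor(z)
        total += cmath.phase(cur / prev)   # small increment, well inside (-pi, pi)
        prev = cur
    return total

def plus_one(z):
    return z + 1.0     # factor with branch point at z = -1

def minus_one(z):
    return z - 1.0     # factor with branch point at z = +1

# Loop enclosing both branch points: phase of f = (theta + phi)/2 changes by 2*pi.
both = 0.5 * (phase_change(plus_one, 0.0, 2.0) + phase_change(minus_one, 0.0, 2.0))

# Loop enclosing only z = +1: phase of f changes by pi, so f changes sign.
one = 0.5 * (phase_change(plus_one, 1.0, 0.5) + phase_change(minus_one, 1.0, 0.5))

print("both branch points enclosed:", both, "  (2*pi =", 2.0 * math.pi, ")")
print("only z = +1 enclosed       :", one, "  (pi =", math.pi, ")")
```

A total change of $2\pi$ returns $f(z)$ to its starting value, whereas a change of $\pi$ does not, which is why the cut must prevent encircling a single branch point.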

As a final note on singularities, Liouville’s theorem (Exercise 6.4.8) states 
“A function that is everywhere finite ( bounded ) and analytic must be a constant .” 

This is readily proved by the use of Cauchy’s integral formula. Conversely, 
the slightest deviation of an analytic function from a constant value implies 
that there must be at least one singularity somewhere in the infinite complex 
plane. Apart from the trivial constant functions, then, singularities are a fact 
of life, and we must learn to live with them. But we shall do more than that. We 
shall use singularities to develop the powerful and useful calculus of residues. 


EXERCISES 


7.1.1 The function $f(z)$ expanded in a Laurent series exhibits a pole of order m at $z = z_0$.
Show that the coefficient of $(z - z_0)^{-1}$, $a_{-1}$, is given by

$$ a_{-1} = \frac{1}{(m - 1)!} \frac{d^{m-1}}{dz^{m-1}} \left[ (z - z_0)^m f(z) \right]_{z = z_0}, $$

with

$$ a_{-1} = \left[ (z - z_0) f(z) \right]_{z = z_0}, $$

when the pole is a simple pole (m = 1). These equations for $a_{-1}$ are extremely
useful in determining the residue to be used in the residue theorem of the next
section.

Hint. The technique that was so successful in proving the uniqueness of power
series, Section 5.7, will work here also.

7.1.2 A function $f(z)$ can be represented by

$$ f(z) = \frac{f_1(z)}{f_2(z)} $$

in which $f_1(z)$ and $f_2(z)$ are analytic. The denominator $f_2(z)$ vanishes at $z = z_0$,
showing that $f(z)$ has a pole at $z = z_0$. However, $f_1(z_0) \neq 0$, $f_2'(z_0) \neq 0$. Show that
$a_{-1}$, the coefficient of $(z - z_0)^{-1}$ in a Laurent expansion of $f(z)$ at $z = z_0$, is given
by

$$ a_{-1} = \frac{f_1(z_0)}{f_2'(z_0)}. $$

This result leads to the Heaviside expansion theorem, Section 15.12.

7.1.3 In analogy with Example 7.1.1 consider in detail the phase of each factor and the
resultant overall phase of $f(z) = (z^2 + 1)^{1/2}$ following a contour similar to that of
Fig. 7.1, but encircling the new branch points.


7.1.4 The Legendre function of the second kind, $Q_\nu(z)$, has branch points at $z = \pm 1$. The
branch points are joined by a cut line along the real (x) axis.

(a) Show that $Q_0(z) = \tfrac{1}{2} \ln((z + 1)/(z - 1))$ is single-valued (with the real axis
$-1 \le x \le 1$ taken as a cut line).

(b) For real argument x and |x| < 1 it is convenient to take

$$ Q_0(x) = \tfrac{1}{2} \ln[(1 + x)/(1 - x)]. $$

Show that

$$ Q_0(x) = \tfrac{1}{2} [Q_0(x + i0) + Q_0(x - i0)]. $$

Here x + i0 indicates z approaches the real axis from above, x - i0 indicates
an approach from below.

7.1.5 As an example of an essential singularity consider $e^{1/z}$ as z approaches zero. For
any complex number $z_0$, $z_0 \neq 0$, show that

$$ e^{1/z} = z_0 $$

has an infinite number of solutions.


7.2 CALCULUS OF RESIDUES 


Residue Theorem 

If the Laurent expansion of a function $f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n$ is integrated
term by term by using a closed contour that encircles one isolated singular
point $z_0$ once in a counterclockwise sense, we obtain (Exercise 6.4.1)

$$ a_n \oint (z - z_0)^n \, dz = a_n \left. \frac{(z - z_0)^{n+1}}{n + 1} \right|_{z_1}^{z_1} = 0 \quad \text{for all } n \neq -1. \tag{7.5} $$

However, if n = -1,

$$ a_{-1} \oint (z - z_0)^{-1} \, dz = a_{-1} \oint \frac{i r e^{i\theta} \, d\theta}{r e^{i\theta}} = 2\pi i a_{-1}. \tag{7.6} $$

Summarizing Eqs. 7.5 and 7.6, we have

$$ \frac{1}{2\pi i} \oint f(z) \, dz = a_{-1}. \tag{7.7} $$

The constant $a_{-1}$, the coefficient of $(z - z_0)^{-1}$ in the Laurent expansion, is called
the residue of $f(z)$ at $z = z_0$.

FIG. 7.2 Excluding isolated singularities

A set of isolated singularities can be handled very nicely by deforming our
contour as shown in Fig. 7.2. Cauchy's integral theorem (Section 6.3) leads to

$$ \oint_C f(z) \, dz + \oint_{C_0} f(z) \, dz + \oint_{C_1} f(z) \, dz + \oint_{C_2} f(z) \, dz + \cdots = 0. \tag{7.8} $$

The circular integral around any given singular point is given by Eq. 7.7,

$$ \oint_{C_i} f(z) \, dz = -2\pi i a_{-1 z_i}, \tag{7.9} $$

assuming a Laurent expansion about the singular point, $z = z_i$. The negative
sign comes from the clockwise integration as shown in Fig. 7.2. Combining
Eqs. 7.8 and 7.9, we have

$$ \oint_C f(z) \, dz = 2\pi i (a_{-1 z_0} + a_{-1 z_1} + a_{-1 z_2} + \cdots)
 = 2\pi i \, (\text{sum of enclosed residues}). \tag{7.10} $$

This is the residue theorem. The problem of evaluating one or more contour 
integrals is replaced by the algebraic problem of computing residues at the 
enclosed singular points. 
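
A direct numerical check of the residue theorem is easy to set up. The sketch below integrates a sample function, $f(z) = e^z / [z(z - 2)]$ (chosen here for illustration, not taken from the text), around two circles centered on the origin and compares the results with $2\pi i$ times the sum of the enclosed residues. The helper `contour_integral` and its midpoint rule are assumptions of this sketch.

```python
import cmath
import math

def contour_integral(f, center, radius, n_steps=4000):
    """Midpoint-rule approximation of the counterclockwise integral of f around a circle."""
    total = 0.0 + 0.0j
    dtheta = 2.0 * math.pi / n_steps
    for k in range(n_steps):
        theta = (k + 0.5) * dtheta
        offset = radius * cmath.exp(1j * theta)
        total += f(center + offset) * 1j * offset * dtheta   # dz = i r e^{i theta} d theta
    return total

def f(z):
    return cmath.exp(z) / (z * (z - 2.0))    # simple poles at z = 0 and z = 2

# Residues from a_{-1} = [(z - z0) f(z)] evaluated at z = z0.
res_at_0 = -0.5                   # e^0 / (0 - 2)
res_at_2 = math.exp(2.0) / 2.0    # e^2 / 2

small = contour_integral(f, 0.0, 1.0)   # encloses z = 0 only
large = contour_integral(f, 0.0, 3.0)   # encloses z = 0 and z = 2

print("radius 1:", small, "  expected:", 2j * math.pi * res_at_0)
print("radius 3:", large, "  expected:", 2j * math.pi * (res_at_0 + res_at_2))
```

Enlarging the contour past z = 2 changes the integral by exactly $2\pi i$ times the newly enclosed residue, which is the content of Eq. 7.10.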

We first use this residue theorem to develop the concept of the Cauchy 
principal value. Then in the remainder of this section we apply the residue 
theorem to a wide variety of definite integrals of mathematical and physical 
interest. In Section 7.3 the concept of Cauchy principal value is used to obtain 
the important dispersion relations. The residue theorem will also be needed in 
Chapter 16 for a variety of integral transforms, particularly the inverse Laplace 
transform. 

Cauchy Principal Value 

Occasionally an isolated first-order pole will be directly on the contour of
integration. In this case we may deform the contour to include or exclude the
residue as desired by including a semicircular detour of infinitesimal radius.
This is shown in Fig. 7.3. The integration over the semicircle then gives

    $\pi i a_{-1}$ if counterclockwise,

    $-\pi i a_{-1}$ if clockwise.

This contribution, + or -, appears on the left-hand side of Eq. 7.10. If our
detour were clockwise, the residue would not be enclosed and there would be
no corresponding term on the right-hand side of Eq. 7.10. However, if our
detour were counterclockwise, this residue would be enclosed by the contour
C and a term $2\pi i a_{-1}$ would appear on the right-hand side of Eq. 7.10. The net
result for either clockwise or counterclockwise detour is that a simple pole on
the contour is counted as one half what it would be if it were within the contour.
This corresponds to taking the Cauchy principal value.

FIG. 7.3 By-passing singular points



FIG. 7.4 Closing the contour with an infinite radius semicircle


For instance, let us suppose that $f(z)$ with a simple pole at $z = x_0$ is integrated
over the entire real axis. The contour is closed with an infinite semicircle in the
upper half-plane (Fig. 7.4). Then

$$ \oint f(z) \, dz = \int_{-\infty}^{x_0 - \delta} f(x) \, dx + \int_{C_{x_0}} f(z) \, dz
 + \int_{x_0 + \delta}^{\infty} f(x) \, dx + \int_{C \, \text{infinite semicircle}}
 = 2\pi i \sum \text{enclosed residues}. \tag{7.11} $$

If the small semicircle $C_{x_0}$ includes $x_0$ (by going below the x-axis,
counterclockwise), $x_0$ is enclosed, and its contribution appears twice, as $\pi i a_{-1}$ in
$\int_{C_{x_0}}$ and as $2\pi i a_{-1}$ in the term $2\pi i \sum$ enclosed residues, for a net contribution of
$\pi i a_{-1}$. If the upper small semicircle is elected, $x_0$ is excluded. The only
contribution is from the clockwise integration over $C_{x_0}$ which yields $-\pi i a_{-1}$. Moving
this to the extreme right of Eq. 7.11, we have $+\pi i a_{-1}$, as before.

The integrals along the x-axis may be combined and the semicircle radius
permitted to approach zero. We have

$$ \lim_{\delta \to 0} \left\{ \int_{-\infty}^{x_0 - \delta} f(x) \, dx + \int_{x_0 + \delta}^{\infty} f(x) \, dx \right\}
 = P \int_{-\infty}^{\infty} f(x) \, dx. \tag{7.12} $$

P indicates the Cauchy principal value and represents the preceding limiting
process. Note carefully that the Cauchy principal value is a balancing or
canceling process. In the vicinity of our singularity at $z = x_0$,

$$ f(x) \approx \frac{a_{-1}}{x - x_0}. \tag{7.13} $$

This is odd, relative to $x_0$. The symmetric or even interval (relative to $x_0$)
provides cancellation of the shaded areas, Fig. 7.5. The contribution of the
singularity is in the integration about the semicircle.

Sometimes, this same limiting technique is applied to the integration limits
$\pm\infty$. We may define

$$ P \int_{-\infty}^{\infty} f(x) \, dx = \lim_{a \to \infty} \int_{-a}^{a} f(x) \, dx. \tag{7.14} $$
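
The balancing nature of the principal value shows up clearly in a numerical experiment. The sketch below integrates 1/x over [-1, 2] with a small gap removed around the pole at x = 0, once with a gap that is symmetric about the pole and once with a lopsided gap. The interval, the gap widths, the midpoint rule, and the helper names are illustrative assumptions; the symmetric case is the prescription of Eq. 7.12 applied to a finite interval.

```python
import math

def midpoint(f, a, b, n=200_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def excluded_integral(delta_left, delta_right):
    """Integrate 1/x over [-1, 2] with the gap (-delta_left, +delta_right) removed."""
    def inverse(x):
        return 1.0 / x
    return midpoint(inverse, -1.0, -delta_left) + midpoint(inverse, delta_right, 2.0)

symmetric = excluded_integral(1e-3, 1e-3)   # gap centered on the pole
lopsided = excluded_integral(1e-3, 2e-3)    # gap not centered on the pole

print("symmetric gap:", symmetric, "  compare ln 2 =", math.log(2.0))
print("lopsided gap :", lopsided)
```

With the symmetric gap the large negative and positive areas next to the pole cancel and the result approaches ln 2; with the lopsided gap the cancellation is incomplete and a different number results, however small the gap is made with the same proportions.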

An alternate treatment moves the pole off the contour and then considers 
the limiting behavior as it is brought back. This technique is illustrated in 
Example 7.2.4, in which the singular points are moved off the contour in such 
a way that the solution is forced into the form desired to satisfy the boundary 
conditions of the physical problem. 

Evaluation of Definite Integrals 

Definite integrals appear repeatedly in problems of mathematical physics as
well as in pure mathematics. Three moderately general techniques are useful in
evaluating definite integrals: (1) contour integration, (2) conversion to gamma
or beta functions (Chapter 10), and (3) numerical quadrature (Appendix A2).
Other approaches include series expansion with term-by-term integration and
integral transforms. As will be seen subsequently, the method of contour
integration is perhaps the most versatile of these methods, since it is applicable
to a wide variety of integrals.


Evaluation of Definite Integrals: $\int_0^{2\pi} f(\sin\theta, \cos\theta) \, d\theta$

The calculus of residues is useful in evaluating a wide variety of definite
integrals in both physical and purely mathematical problems. We consider,
first, integrals of the form

$$ I = \int_0^{2\pi} f(\sin\theta, \cos\theta) \, d\theta, \tag{7.15} $$

where f is finite for all values of $\theta$. We also require f to be a rational function
of $\sin\theta$ and $\cos\theta$ so that it will be single-valued. Let

$$ z = e^{i\theta}, \qquad dz = i e^{i\theta} \, d\theta. $$

From this,

$$ d\theta = -i \frac{dz}{z}, \qquad \sin\theta = \frac{z - z^{-1}}{2i}, \qquad \cos\theta = \frac{z + z^{-1}}{2}. \tag{7.16} $$

Our integral becomes

$$ I = -i \oint f\left( \frac{z - z^{-1}}{2i}, \frac{z + z^{-1}}{2} \right) \frac{dz}{z}, \tag{7.17} $$

with the path of integration the unit circle. By the residue theorem, Eq. 7.10,

$$ I = (-i) 2\pi i \sum \text{residues within the unit circle}. \tag{7.18} $$

Note that we are after the residues of $f(z)/z$. Illustrations of integrals of this type
are provided by Exercises 7.2.7 to 7.2.10.


EXAMPLE 7.2.1 


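Our problem is to evaluate the definite integral

$$ I = \int_0^{2\pi} \frac{d\theta}{1 + \varepsilon \cos\theta}, \qquad |\varepsilon| < 1. $$

By Eq. 7.17 this becomes

$$ I = -i \oint_{\text{unit circle}} \frac{dz}{z[1 + (\varepsilon/2)(z + z^{-1})]}
 = -i \frac{2}{\varepsilon} \oint \frac{dz}{z^2 + (2/\varepsilon) z + 1}. $$

The denominator has roots

$$ z_- = -\frac{1}{\varepsilon} - \frac{1}{\varepsilon} \sqrt{1 - \varepsilon^2} \quad \text{and} \quad
 z_+ = -\frac{1}{\varepsilon} + \frac{1}{\varepsilon} \sqrt{1 - \varepsilon^2}. $$

$z_+$ is within the unit circle; $z_-$ is outside. Then by Eq. 7.18 and Exercise 7.1.1

$$ I = -i \cdot \frac{2}{\varepsilon} \cdot 2\pi i \left. \frac{1}{z + 1/\varepsilon + (1/\varepsilon)\sqrt{1 - \varepsilon^2}} \right|_{z = -1/\varepsilon + (1/\varepsilon)\sqrt{1 - \varepsilon^2}}. $$

We obtain

$$ \int_0^{2\pi} \frac{d\theta}{1 + \varepsilon \cos\theta} = \frac{2\pi}{\sqrt{1 - \varepsilon^2}}, \qquad |\varepsilon| < 1. $$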

Evaluation of Definite Integrals: $\int_{-\infty}^{\infty} f(x) \, dx$

Suppose that our definite integral has the form

$$ I = \int_{-\infty}^{\infty} f(x) \, dx \tag{7.19} $$

and satisfies the two conditions:

a. $f(z)$ is analytic in the upper half-plane except for a
finite number of poles. (It will be assumed that there
are no poles on the real axis. If poles are present on
the real axis, they may be included or excluded as
discussed earlier in this section.)

b. $f(z)$ vanishes as strongly¹ as $1/z^2$ for $|z| \to \infty$,
$0 \le \arg z \le \pi$.


With these conditions, we may take as a contour of integration the real axis 
and a semicircle in the upper half-plane as shown in Fig. 7.6. We let the radius R 
of the semicircle become infinitely large. Then 

$$\oint f(z)\,dz = \lim_{R\to\infty}\int_{-R}^{R} f(x)\,dx + \lim_{R\to\infty}\int_0^{\pi} f(Re^{i\theta})\,iRe^{i\theta}\,d\theta = 2\pi i\sum \text{residues (upper half-plane)}. \tag{7.20}$$


¹ We could use "$f(z)$ vanishes faster than $1/z$," but we wish to have $f(z)$ single-valued.





From the second condition the second integral (over the semicircle) vanishes 
and 


$$\int_{-\infty}^{\infty} f(x)\,dx = 2\pi i\sum \text{residues (upper half-plane)}. \tag{7.21}$$


EXAMPLE 7.2.2 


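Evaluate

$$I = \int_{-\infty}^{\infty}\frac{dx}{1 + x^2}. \tag{7.22}$$

From Eq. 7.21

$$\int_{-\infty}^{\infty}\frac{dx}{1 + x^2} = 2\pi i\sum \text{residues (upper half-plane)}.$$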


Here and in every other similar problem we have the question — where are the 
poles? Rewriting the integrand as 

$$\frac{1}{z^2 + 1} = \frac{1}{z + i}\cdot\frac{1}{z - i}, \tag{7.23}$$

we see that there are simple poles (order 1) at $z = i$ and $z = -i$.

A simple pole at z = z 0 indicates (and is indicated by) a Laurent expansion of 
the form 

$$f(z) = \frac{a_{-1}}{z - z_0} + a_0 + \sum_{n=1}^{\infty} a_n(z - z_0)^n. \tag{7.24}$$

The residue $a_{-1}$ is easily isolated as (Exercise 7.1.1)

$$a_{-1} = (z - z_0)f(z)\,\big|_{z = z_0}. \tag{7.25}$$

Using Eq. 7.25, we find that the residue at $z = i$ is $1/2i$, whereas that at $z = -i$ is $-1/2i$.

Then

$$\int_{-\infty}^{\infty}\frac{dx}{1 + x^2} = 2\pi i\cdot\frac{1}{2i} = \pi. \tag{7.26}$$

Here we have used $a_{-1} = 1/2i$ for the residue of the one included pole at $z = i$. Readers should satisfy themselves that it is possible to use the lower semicircle and that this choice will lead to the same result, $I = \pi$. A somewhat more delicate problem is provided by the next example.
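The residue used in Example 7.2.2 can also be estimated numerically from its definition as $(1/2\pi i)\oint f(z)\,dz$ around a small circle enclosing only the pole. The sketch below is illustrative Python, not part of the original text; the helper name `residue`, the circle radius, and the number of sample points are assumptions made for the example.

```python
import cmath
import math

def f(z):
    """Integrand of Example 7.2.2 continued into the complex plane."""
    return 1.0 / (1.0 + z * z)

def residue(func, z0, radius=0.5, n=400):
    """Estimate the residue at z0 as (1/(2*pi*i)) times the integral around a small circle."""
    total = 0.0 + 0.0j
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        offset = radius * cmath.exp(1j * theta)
        # On the circle z = z0 + offset we have dz = i * offset * dtheta.
        total += func(z0 + offset) * 1j * offset
    return total * (2.0 * math.pi / n) / (2.0j * math.pi)

a_minus_1 = residue(f, 1j)              # the one pole in the upper half-plane
integral = 2.0j * math.pi * a_minus_1   # Eq. 7.21
print(a_minus_1, integral)
```

The radius must be small enough that the circle about $z = i$ does not reach the second pole at $z = -i$.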

Evaluation of Definite Integrals: $\int_{-\infty}^{\infty} f(x)e^{iax}\,dx$

Consider the definite integral

$$I = \int_{-\infty}^{\infty} f(x)e^{iax}\,dx, \tag{7.27}$$

with $a$ real and positive. This is a Fourier transform, Chapter 15. We assume the two conditions:

a. $f(z)$ is analytic in the upper half-plane except for a finite number of poles.

b. $$\lim_{|z|\to\infty} f(z) = 0, \qquad 0 \le \arg z \le \pi. \tag{7.28}$$

Note that this is a less restrictive condition than the second condition imposed on $f(z)$ for integrating $\int_{-\infty}^{\infty} f(x)\,dx$ previously.

FIG. 7.7 (a) $y = (2/\pi)\theta$, (b) $y = \sin\theta$

We employ the contour shown in Fig. 7.6. The application of the calculus of 
residues is the same as the one just considered, but here we have to work a little 
harder to show that the integral over the (infinite) semicircle goes to zero. This 
integral becomes 

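$$I_R = \int_0^{\pi} f(Re^{i\theta})\,e^{iaR\cos\theta - aR\sin\theta}\,iRe^{i\theta}\,d\theta. \tag{7.29}$$

Let $R$ be so large that $|f(z)| = |f(Re^{i\theta})| < \varepsilon$. Then

$$|I_R| \le \varepsilon R\int_0^{\pi} e^{-aR\sin\theta}\,d\theta = 2\varepsilon R\int_0^{\pi/2} e^{-aR\sin\theta}\,d\theta. \tag{7.30}$$

In the range $[0, \pi/2]$

$$\frac{2}{\pi}\theta \le \sin\theta.$$

Therefore (Fig. 7.7)

$$|I_R| \le 2\varepsilon R\int_0^{\pi/2} e^{-aR2\theta/\pi}\,d\theta. \tag{7.31}$$

Now, integrating by inspection, we obtain

$$|I_R| \le 2\varepsilon R\,\frac{1 - e^{-aR}}{aR2/\pi}.$$

Finally,

$$\lim_{R\to\infty}|I_R| \le \frac{\pi}{a}\,\varepsilon. \tag{7.32}$$

From Eq. 7.28 $\varepsilon \to 0$ as $R \to \infty$ and

$$\lim_{R\to\infty}|I_R| = 0. \tag{7.33}$$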

This useful result is sometimes called Jordan’s lemma. With it, we are prepared to tackle Fourier integrals of the form shown in Eq. 7.27.
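The inequality at the heart of Jordan's lemma, $R\int_0^{\pi} e^{-aR\sin\theta}\,d\theta \le (\pi/a)(1 - e^{-aR})$, can be illustrated numerically. The following Python sketch is not part of the original argument; the function names, the midpoint rule, and the sample values of $a$ and $R$ are our own choices.

```python
import math

def semicircle_factor(a, R, n=20000):
    """Midpoint-rule estimate of R * integral_0^pi exp(-a*R*sin(theta)) dtheta."""
    h = math.pi / n
    return R * h * sum(math.exp(-a * R * math.sin((k + 0.5) * h)) for k in range(n))

def jordan_bound(a, R):
    """Bound obtained from sin(theta) >= 2*theta/pi on [0, pi/2]."""
    return (math.pi / a) * (1.0 - math.exp(-a * R))

a = 1.0  # illustrative value, any a > 0 may be used
rows = [(R, semicircle_factor(a, R), jordan_bound(a, R)) for R in (1.0, 10.0, 100.0)]
for R, value, bound in rows:
    print(R, value, bound)
```

The left-hand side stays bounded as $R$ grows, so the factor $\varepsilon$ supplied by condition (b) is what drives $|I_R|$ to zero.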

Using the contour shown in Fig. 7.6, we have

$$\int_{-\infty}^{\infty} f(x)e^{iax}\,dx + \lim_{R\to\infty} I_R = 2\pi i\sum \text{residues (upper half-plane)}.$$

Since the integral over the upper semicircle $I_R$ vanishes as $R \to \infty$ (Jordan’s lemma),

$$\int_{-\infty}^{\infty} f(x)e^{iax}\,dx = 2\pi i\sum \text{residues (upper half-plane)}, \qquad a > 0. \tag{7.34}$$


EXAMPLE 7.2.3 Singularity on Contour of Integration 
The problem is to evaluate

$$I = \int_0^{\infty}\frac{\sin x}{x}\,dx. \tag{7.35}$$

This may be taken as half the imaginary part² of

$$I_z = P\int_{-\infty}^{\infty}\frac{e^{iz}\,dz}{z}. \tag{7.36}$$


FIG. 7.8

² One can use $\int [(e^{iz} - e^{-iz})/2iz]\,dz$, but then two different contours will be needed for the two exponentials (compare Example 7.2.4).

Now the only pole is a simple pole at $z = 0$ and the residue there by Eq. 7.25 is $a_{-1} = 1$. We choose the contour shown in Fig. 7.8 (1) to avoid the pole, (2) to include the real axis, and (3) to yield a vanishingly small integrand for $z = iy$, $y \to \infty$. Note that in this case a large (infinite) semicircle in the lower half-plane would be disastrous. We have


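$$\oint\frac{e^{iz}\,dz}{z} = \int_{-R}^{-r}\frac{e^{ix}\,dx}{x} + \int_{C_1}\frac{e^{iz}\,dz}{z} + \int_{r}^{R}\frac{e^{ix}\,dx}{x} + \int_{C_2}\frac{e^{iz}\,dz}{z} = 0, \tag{7.37}$$

the final zero coming from the residue theorem (Eq. 7.10). Here $C_1$ is the small semicircle of radius $r$ about the origin and $C_2$ the large semicircle of radius $R$. By Jordan’s lemma

$$\int_{C_2}\frac{e^{iz}\,dz}{z} = 0, \tag{7.38}$$

and

$$\oint\frac{e^{iz}\,dz}{z} = \int_{C_1}\frac{e^{iz}\,dz}{z} + P\int_{-\infty}^{\infty}\frac{e^{ix}\,dx}{x} = 0. \tag{7.39}$$

The integral over the small semicircle yields $(-)\pi i$ times the residue of 1, the minus sign being a result of going clockwise. Taking the imaginary part,³ we have

$$\int_{-\infty}^{\infty}\frac{\sin x}{x}\,dx = \pi \tag{7.40}$$

or

$$\int_0^{\infty}\frac{\sin x}{x}\,dx = \frac{\pi}{2}. \tag{7.41}$$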

The contour of Fig. 7.8, although convenient, is not at all unique. Another 
choice of contour for evaluating Eq. 7.35 is presented as Exercise 7.2.15. 
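The contribution of the small semicircle in Example 7.2.3 can be examined numerically. The sketch below is illustrative Python, not part of the original text; the function name and the chosen radii are assumptions. It integrates $e^{iz}/z$ over $z = re^{i\theta}$ with $\theta$ running clockwise from $\pi$ to 0, and the values approach $-\pi i$ as $r \to 0$.

```python
import cmath
import math

def small_semicircle(r, n=2000):
    """Integral of exp(iz)/z over z = r*exp(i*theta), theta running from pi down to 0."""
    h = math.pi / n
    total = 0.0 + 0.0j
    for k in range(n):
        theta = math.pi - (k + 0.5) * h   # midpoint of each step, traversed clockwise
        z = r * cmath.exp(1j * theta)
        # dz/z = i dtheta, and dtheta = -h for the clockwise sense.
        total += cmath.exp(1j * z) * 1j * (-h)
    return total

for r in (1.0, 0.1, 0.001):
    print(r, small_semicircle(r))

limit_value = small_semicircle(1e-6)
```

The deviation from $-\pi i$ is of first order in $r$, consistent with the half-residue rule for a simple pole.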


EXAMPLE 7.2.4 Quantum Mechanical Scattering 


The quantum mechanical analysis of scattering leads to the function

$$I(\sigma) = \int_{-\infty}^{\infty}\frac{x\sin x\,dx}{x^2 - \sigma^2}, \tag{7.42}$$

where $\sigma$ is real and positive. From the physical conditions of the problem there is a further requirement: $I(\sigma)$ is to have the form $e^{i\sigma}$ so that it will represent an outgoing scattered wave.

Using

$$\sin z = \frac{1}{i}\sinh iz = \frac{1}{2i}e^{iz} - \frac{1}{2i}e^{-iz}, \tag{7.43}$$


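³ Alternatively, we may combine the integrals of Eq. 7.37 as

$$\int_{-R}^{-r}\frac{e^{ix}\,dx}{x} + \int_{r}^{R}\frac{e^{ix}\,dx}{x} = \int_{r}^{R}\left(e^{ix} - e^{-ix}\right)\frac{dx}{x} = 2i\int_{r}^{R}\frac{\sin x}{x}\,dx.$$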





we write Eq. 7.42 in the complex plane as

$$I(\sigma) = I_1 + I_2, \tag{7.44}$$

with

$$I_1 = \frac{1}{2i}\int_{-\infty}^{\infty}\frac{ze^{iz}}{z^2 - \sigma^2}\,dz, \qquad I_2 = -\frac{1}{2i}\int_{-\infty}^{\infty}\frac{ze^{-iz}}{z^2 - \sigma^2}\,dz. \tag{7.45}$$

Integral $I_1$ is similar to Example 7.2.3 and, as in that case, we may complete the contour by an infinite semicircle in the upper half-plane. For $I_2$ the exponential is negative and we complete the contour by an infinite semicircle in the lower half-plane, as shown in Fig. 7.9. As in Example 7.2.3, neither semicircle contributes anything to the integral (Jordan’s lemma).



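There is still the problem of locating the poles and evaluating the residues. We find poles at $z = +\sigma$ and $z = -\sigma$ on the contour of integration. The residues of the integrands $ze^{\pm iz}/(z^2 - \sigma^2)$ are (Exercises 7.1.1, 7.2.1):

$$\begin{array}{c|cc} & z = \sigma & z = -\sigma \\ \hline I_1 & \dfrac{e^{i\sigma}}{2} & \dfrac{e^{-i\sigma}}{2} \\ I_2 & \dfrac{e^{-i\sigma}}{2} & \dfrac{e^{i\sigma}}{2} \end{array}$$

Detouring around the poles, as shown in Fig. 7.9 (it matters little whether we go above or below), we find that the residue theorem leads to

$$PI_1 - \pi i\left(\frac{1}{2i}\right)\frac{e^{-i\sigma}}{2} + \pi i\left(\frac{1}{2i}\right)\frac{e^{i\sigma}}{2} = 2\pi i\left(\frac{1}{2i}\right)\frac{e^{i\sigma}}{2}, \tag{7.46}$$

for we have enclosed the singularity at $z = \sigma$ but excluded the one at $z = -\sigma$. In similar fashion, but noting that the contour for $I_2$ is clockwise,

$$PI_2 - \pi i\left(\frac{-1}{2i}\right)\frac{e^{i\sigma}}{2} + \pi i\left(\frac{-1}{2i}\right)\frac{e^{-i\sigma}}{2} = -2\pi i\left(\frac{-1}{2i}\right)\frac{e^{i\sigma}}{2}. \tag{7.47}$$

Adding Eqs. 7.46 and 7.47, we have

$$PI(\sigma) = PI_1 + PI_2 = \frac{\pi}{2}\left(e^{i\sigma} + e^{-i\sigma}\right) = \pi\cosh i\sigma = \pi\cos\sigma. \tag{7.48}$$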


This is a perfectly good evaluation of Eq. 7.42, but unfortunately the cosine 
dependence is appropriate for a standing wave and not for the outgoing scattered 
wave as specified. 

To obtain the desired form, we try a different technique. Instead of dodging around the singular points, let us move them off the real axis. Specifically, let $\sigma \to \sigma + i\gamma$, $-\sigma \to -\sigma - i\gamma$, where $\gamma$ is positive but small and will eventually be made to approach zero, that is,

$$I_+(\sigma) = \lim_{\gamma\to 0} I(\sigma + i\gamma). \tag{7.49}$$


With this simple substitution, the first integral $I_1$ becomes

$$I_1(\sigma + i\gamma) = 2\pi i\left(\frac{1}{2i}\right)\frac{e^{i(\sigma + i\gamma)}}{2} \tag{7.50}$$

by direct application of the residue theorem. Also,

$$I_2(\sigma + i\gamma) = -2\pi i\left(\frac{-1}{2i}\right)\frac{e^{i(\sigma + i\gamma)}}{2}. \tag{7.51}$$

Adding Eqs. 7.50 and 7.51 and then letting $\gamma \to 0$, we obtain

$$I_+(\sigma) = \lim_{\gamma\to 0}\left[I_1(\sigma + i\gamma) + I_2(\sigma + i\gamma)\right] = \lim_{\gamma\to 0}\pi e^{i(\sigma + i\gamma)} = \pi e^{i\sigma}, \tag{7.52}$$

a result that does fit the boundary conditions of our scattering problem.

It is interesting to note that the substitution $\sigma \to \sigma - i\gamma$ would have led to

$$I_-(\sigma) = \pi e^{-i\sigma}, \tag{7.53}$$

which could represent an incoming wave. Our earlier result (Eq. 7.48) is seen to be the arithmetic average of Eqs. 7.52 and 7.53. This average is the Cauchy principal value of the integral. Note that we have these possibilities (Eqs. 7.48, 7.52, and 7.53) because our integral is an improper integral. It is not uniquely defined until we specify the particular limiting process (or average) to be used.


Evaluation of Definite Integrals: Exponential Forms

With exponential or hyperbolic functions present in the integrand, life gets somewhat more complicated than before. Instead of a general overall prescription, the contour must be chosen to fit the specific integral. These cases are also opportunities to illustrate the versatility and power of contour integration.

As an example, we consider an integral that will be quite useful in developing a relation between $z!$ and $(-z)!$. Notice how the periodicity along the imaginary axis is exploited.


EXAMPLE 7.2.5 Factorial Function 


We wish to evaluate

$$I = \int_{-\infty}^{\infty}\frac{e^{ax}}{1 + e^{x}}\,dx, \qquad 0 < a < 1. \tag{7.54}$$

The limits on $a$ are necessary (and sufficient) to prevent the integral from diverging as $x \to \pm\infty$. This integral (Eq. 7.54) may be handled by replacing the real variable $x$ by the complex variable $z$ and integrating around the contour shown in Fig. 7.10. If we take the limit as $R \to \infty$, the real axis, of course, leads to the integral we want. The return path along $y = 2\pi$ is chosen to leave the denominator of the integral invariant, at the same time introducing a constant factor $e^{i2\pi a}$ in the numerator. We have, in the complex plane,

$$\oint\frac{e^{az}}{1 + e^{z}}\,dz = \lim_{R\to\infty}\left(\int_{-R}^{R}\frac{e^{ax}}{1 + e^{x}}\,dx - e^{i2\pi a}\int_{-R}^{R}\frac{e^{ax}}{1 + e^{x}}\,dx\right) = \left(1 - e^{i2\pi a}\right)\int_{-\infty}^{\infty}\frac{e^{ax}}{1 + e^{x}}\,dx. \tag{7.55}$$

In addition there are two vertical sections ($0 \le y \le 2\pi$), which vanish (exponentially) as $R \to \infty$.

Now where are the poles and what are the residues? We have a pole when 

$$e^{z} = e^{x}e^{iy} = -1. \tag{7.56}$$

Equation 7.56 is satisfied at $z = 0 + i\pi$. By a Laurent expansion⁴ in powers of $(z - i\pi)$ the pole is seen to be a simple pole with a residue of $-e^{i\pi a}$. Then, applying the residue theorem once more,

$$\left(1 - e^{i2\pi a}\right)\int_{-\infty}^{\infty}\frac{e^{ax}}{1 + e^{x}}\,dx = 2\pi i\left(-e^{i\pi a}\right). \tag{7.57}$$

This quickly reduces to

$$\int_{-\infty}^{\infty}\frac{e^{ax}}{1 + e^{x}}\,dx = \frac{\pi}{\sin a\pi}, \qquad 0 < a < 1. \tag{7.58}$$

⁴ $1 + e^{z} = 1 + e^{z - i\pi}e^{i\pi} = 1 - e^{z - i\pi} = -(z - i\pi)\left(1 + \dfrac{z - i\pi}{2!} + \dfrac{(z - i\pi)^2}{3!} + \cdots\right)$.


Using the beta function (Section 10.4), we can show the integral to be equal to the product $(a - 1)!\,(-a)!$. This results in the interesting and useful factorial function relation

$$a!\,(-a)! = \frac{\pi a}{\sin\pi a}. \tag{7.59}$$

Although Eq. 7.58 holds for real $a$, $0 < a < 1$, Eq. 7.59 may be extended by analytic continuation to all values of $a$, real and complex, excluding only real integral values.
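Equation 7.58 lends itself to a direct numerical check. The following Python sketch is not part of the original text; the function name, the truncation limits, the step size, and the sample value $a = 0.3$ are our own choices. Since the integrand decays exponentially in both directions for $0 < a < 1$, a truncated trapezoid sum is adequate.

```python
import math

def exponential_integral(a, lower=-80.0, upper=80.0, step=0.01):
    """Trapezoid estimate of the integral of exp(a*x)/(1 + exp(x)) over the real axis."""
    n = int(round((upper - lower) / step))
    total = 0.0
    for k in range(n + 1):
        x = lower + k * step
        weight = 0.5 if k in (0, n) else 1.0   # end points carry half weight
        total += weight * math.exp(a * x) / (1.0 + math.exp(x))
    return total * step

a = 0.3  # illustrative value in the allowed range 0 < a < 1
numeric = exponential_integral(a)
exact = math.pi / math.sin(math.pi * a)   # right-hand side of Eq. 7.58
print(numeric, exact)
```

For $a$ close to 0 or 1 the decay on one side becomes slow and the truncation limits would have to be widened.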

As a final example of contour integrals of exponential functions, we consider 
Bernoulli numbers again. 


EXAMPLE 7.2.6 Bernoulli Numbers 


In Section 5.9 the Bernoulli numbers were defined by the expansion

$$\frac{x}{e^{x} - 1} = \sum_{n=0}^{\infty}\frac{B_n}{n!}x^n. \tag{7.60}$$

Replacing $x$ with $z$ (analytic continuation), we have a Taylor series (compare Eq. 6.60) with

$$B_n = \frac{n!}{2\pi i}\oint_{C_0}\frac{z}{e^{z} - 1}\,\frac{dz}{z^{n+1}}, \tag{7.61}$$

where the contour $C_0$ is around the origin counterclockwise with $|z| < 2\pi$ to avoid the poles at $\pm 2\pi i$.

For $n = 0$ we have a simple pole at $z = 0$ with a residue of $+1$. Hence by Eq. 7.10

$$B_0 = \frac{0!}{2\pi i}\cdot 2\pi i(1) = 1. \tag{7.62}$$

For $n = 1$ the singularity at $z = 0$ becomes a second-order pole. The residue may be shown to be $-\frac{1}{2}$ by series expansion of the exponential, followed by a binomial expansion. This results in

$$B_1 = \frac{1!}{2\pi i}\cdot 2\pi i\left(-\frac{1}{2}\right) = -\frac{1}{2}. \tag{7.63}$$

FIG. 7.11 Contour of integration for Bernoulli numbers

For $n \ge 2$ this procedure becomes rather tedious, and we resort to a different means of evaluating Eq. 7.61. The contour is deformed, as shown in Fig. 7.11.

The new contour $C$ still encircles the origin, as required, but now it also encircles (in a negative direction) an infinite series of singular points along the imaginary axis at $z = \pm p2\pi i$, $p = 1, 2, 3, \ldots$. The integration back and forth along the $x$-axis cancels out, and for $R \to \infty$ the integration over the infinite circle yields zero. Remember that $n \ge 2$. Therefore

$$\oint_{C_0}\frac{z}{e^{z} - 1}\,\frac{dz}{z^{n+1}} = -2\pi i\sum_{p=1}^{\infty}\text{residues } (z = \pm p2\pi i). \tag{7.64}$$

At $z = p2\pi i$ we have a simple pole with a residue $(p2\pi i)^{-n}$. When $n$ is odd, the residue from $z = p2\pi i$ exactly cancels that from $z = -p2\pi i$ and $B_n = 0$ for $n = 3, 5, 7$, and so on. For $n$ even the residues add, giving

$$B_n = \frac{n!}{2\pi i}(-2\pi i)\,2\sum_{p=1}^{\infty}\frac{1}{p^n(2\pi i)^n} = -\frac{(-1)^{n/2}\,2\,n!}{(2\pi)^n}\sum_{p=1}^{\infty}p^{-n} = -\frac{(-1)^{n/2}\,2\,n!}{(2\pi)^n}\,\zeta(n) \qquad (n \text{ even}), \tag{7.65}$$

where $\zeta(n)$ is the Riemann zeta function introduced in Section 5.9. Equation 7.65 corresponds to Eq. 5.151 of Section 5.9.
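The relation between the even Bernoulli numbers and the zeta function can be checked against the familiar values $B_2 = 1/6$, $B_4 = -1/30$, and $B_6 = 1/42$. The sketch below is illustrative Python, not part of the original text; the function names and the number of terms kept in the zeta sum are assumptions made for the example.

```python
import math

def zeta(n, terms=200000):
    """Partial sum of the Riemann zeta function; adequate for even n >= 2."""
    return sum(p ** (-n) for p in range(1, terms + 1))

def bernoulli_even(n):
    """B_n for even n >= 2 from the sum over the poles at z = +/- 2*pi*i*p."""
    sign = -((-1) ** (n // 2))
    return sign * 2.0 * math.factorial(n) / (2.0 * math.pi) ** n * zeta(n)

values = {n: bernoulli_even(n) for n in (2, 4, 6)}
print(values)
```

The partial sum converges slowly for $n = 2$ (the truncation error falls off only as the reciprocal of the number of terms), which is why so many terms are retained.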

Branch Points, Cut Lines

Sometimes the integrand will contain $z$ to a fractional power. The integrand is multivalued. There is a branch point and a cut line is required. Exercises 7.2.18, 7.2.19, and 7.2.23 are examples of this situation. A key point to remember is that the function can be expected to be discontinuous across this mandatory cut line. The integral along one side of the cut line will probably not equal the integral along the other side.

EXERCISES

7.2.1 Determine the nature of the singularities of each of the following functions and evaluate the residues ($a > 0$).

(a) $\dfrac{1}{z^2 + a^2}$.

(b) $\dfrac{1}{(z^2 + a^2)^2}$.

(c) $\dfrac{z^2}{(z^2 + a^2)^2}$.

(d) $\dfrac{\sin 1/z}{z^2 + a^2}$.

(e) $\dfrac{ze^{+iz}}{z^2 + a^2}$.

(f) $\dfrac{ze^{+iz}}{z^2 - a^2}$.

(g) $\dfrac{e^{+iz}}{z^2 - a^2}$.

(h) $\dfrac{z^{-k}}{z + 1}$, $0 < k < 1$.


7.2.2 Locate the singularities and evaluate the residues of each of the following functions.

(a) $z^{-n}(e^{z} - 1)^{-1}$, $z \ne 0$,

(b) $\dfrac{z^2e^{z}}{1 + e^{2z}}$.

7.2.3 The statement that the integral halfway around a singular point is equal to one half the integral all the way around was limited to simple poles. Show, by a specific example, that

$$\int_{\text{Semicircle}} f(z)\,dz = \frac{1}{2}\oint_{\text{Circle}} f(z)\,dz$$

does not necessarily hold if the integral encircles a pole of higher order.

Hint. Try $f(z) = z^{-2}$.

7.2.4 A function $f(z)$ is analytic along the real axis except for a third-order pole at $z = x_0$. The Laurent expansion about $z = x_0$ has the form

$$f(z) = \frac{a_{-3}}{(z - x_0)^3} + \frac{a_{-1}}{(z - x_0)} + g(z),$$

with $g(z)$ analytic at $z = x_0$. Show that the Cauchy principal value technique is applicable in the sense that

(a) $\displaystyle\lim_{\delta\to 0}\left\{\int_{-\infty}^{x_0 - \delta} f(x)\,dx + \int_{x_0 + \delta}^{\infty} f(x)\,dx\right\}$ is well behaved.

(b) $\displaystyle\int_{C_{x_0}} f(z)\,dz = \pm i\pi a_{-1}$, where $C_{x_0}$ denotes a small semicircle about $z = x_0$.

7.2.5 The unit step function is defined as (compare Exercise 8.7.13)

$$u(s - a) = \begin{cases} 0, & s < a \\ 1, & s > a. \end{cases}$$

Show that $u(s)$ has the integral representations

(a) $\displaystyle u(s) = \lim_{\varepsilon\to 0^+}\frac{1}{2\pi i}\int_{-\infty}^{\infty}\frac{e^{ixs}}{x - i\varepsilon}\,dx$,

(b) $\displaystyle u(s) = \frac{1}{2} + \frac{1}{2\pi i}P\int_{-\infty}^{\infty}\frac{e^{ixs}}{x}\,dx$.

Note. The parameter $s$ is real.

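7.2.6 Most of the special functions of mathematical physics may be generated (defined) by a generating function of the form

$$g(t, x) = \sum_n f_n(x)t^n.$$

Given the following integral representations, derive the corresponding generating function:

(a) Bessel: $\displaystyle J_n(x) = \frac{1}{2\pi i}\oint e^{(x/2)(t - 1/t)}\,t^{-n-1}\,dt$.

(b) Modified Bessel: $\displaystyle I_n(x) = \frac{1}{2\pi i}\oint e^{(x/2)(t + 1/t)}\,t^{-n-1}\,dt$.

(c) Legendre: $\displaystyle P_n(x) = \frac{1}{2\pi i}\oint (1 - 2tx + t^2)^{-1/2}\,t^{-n-1}\,dt$.

(d) Hermite: $\displaystyle H_n(x) = \frac{n!}{2\pi i}\oint e^{-t^2 + 2tx}\,t^{-n-1}\,dt$.

(e) Laguerre: $\displaystyle L_n(x) = \frac{1}{2\pi i}\oint\frac{e^{-xt/(1 - t)}}{(1 - t)\,t^{n+1}}\,dt$.

(f) Chebyshev: $\displaystyle T_n(x) = \frac{1}{4\pi i}\oint\frac{(1 - t^2)\,t^{-n-1}}{(1 - 2tx + t^2)}\,dt$.

Each of the contours encircles the origin and no other singular points.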


7.2.7 Generalizing Example 7.2.1, show that

$$\int_0^{2\pi}\frac{d\theta}{a \pm b\cos\theta} = \int_0^{2\pi}\frac{d\theta}{a \pm b\sin\theta} = \frac{2\pi}{(a^2 - b^2)^{1/2}}, \qquad\text{for } a > |b|.$$

What happens if $|b| > |a|$?

7.2.8 Show that

$$\int_0^{\pi}\frac{d\theta}{(a + \cos\theta)^2} = \frac{\pi a}{(a^2 - 1)^{3/2}}, \qquad a > 1.$$

7.2.9 Show that

$$\int_0^{2\pi}\frac{d\theta}{1 - 2t\cos\theta + t^2} = \frac{2\pi}{1 - t^2}, \qquad\text{for } |t| < 1.$$

What happens if $|t| > 1$? What happens if $|t| = 1$?

7.2.10 With the calculus of residues show that

$$\int_0^{\pi}\cos^{2n}\theta\,d\theta = \pi\frac{(2n)!}{2^{2n}(n!)^2} = \pi\frac{(2n - 1)!!}{(2n)!!}, \qquad n = 0, 1, 2, \ldots.$$

(The double factorial notation is defined in Section 10.1.)

Hint. $\cos\theta = \frac{1}{2}(e^{i\theta} + e^{-i\theta}) = \frac{1}{2}(z + z^{-1})$, $|z| = 1$.

7.2.11 Evaluate

$$\int_{-\infty}^{\infty}\frac{\cos bx - \cos ax}{x^2}\,dx, \qquad a > b > 0.$$

ANS. $\pi(a - b)$.

7.2.12 Prove that

$$\int_0^{\infty}\frac{\sin^2 x}{x^2}\,dx = \frac{\pi}{2}.$$

Hint. $\sin^2 x = \frac{1}{2}(1 - \cos 2x)$.

7.2.13 A quantum mechanical calculation of a transition probability leads to the function $f(t, \omega) = 2(1 - \cos\omega t)/\omega^2$. Show that

$$\int_{-\infty}^{\infty} f(t, \omega)\,d\omega = 2\pi t.$$

7.2.14 Show that ($a > 0$)

(a) $\displaystyle\int_{-\infty}^{\infty}\frac{\cos x}{x^2 + a^2}\,dx = \frac{\pi}{a}e^{-a}$.

How is the right side modified if $\cos x$ is replaced by $\cos kx$?

(b) $\displaystyle\int_{-\infty}^{\infty}\frac{x\sin x}{x^2 + a^2}\,dx = \pi e^{-a}$.

How is the right side modified if $\sin x$ is replaced by $\sin kx$?

These integrals may also be interpreted as Fourier cosine and sine transforms (Chapter 15).

7.2.15 Use the contour shown (Fig. 7.12) with $R \to \infty$ to prove that

$$\int_{-\infty}^{\infty}\frac{\sin x}{x}\,dx = \pi.$$

FIG. 7.12

7.2.16 In the quantum theory of atomic collisions we encounter the integral

$$I = \int_{-\infty}^{\infty}\frac{\sin t}{t}\,e^{ipt}\,dt,$$

in which $p$ is real. Show that

$$I = 0, \quad |p| > 1,$$
$$I = \pi, \quad |p| < 1.$$

What happens if $p = \pm 1$?
7.2.17 Evaluate

$$\int_0^{\infty}\frac{(\ln x)^2}{1 + x^2}\,dx$$

(a) by appropriate series expansion of the integrand to obtain

$$4\sum_{n=0}^{\infty}(-1)^n(2n + 1)^{-3},$$

(b) by contour integration to obtain

$$\frac{\pi^3}{8}.$$

Hint. $x \to z = e^{t}$. Try the contour shown in Fig. 7.13, letting $R \to \infty$.

FIG. 7.13

7.2.18 Show that

$$\int_0^{\infty}\frac{x^{a}}{(x + 1)^2}\,dx = \frac{\pi a}{\sin\pi a},$$

where $-1 < a < 1$. Here is still another way of deriving Eq. 7.59.

Hint. Use the contour shown in Fig. 7.14, noting that $z = 0$ is a branch point and the positive $x$-axis is a cut line. Note also the comments on phases following Example 7.1.1.

FIG. 7.14

7.2.19 Show that

$$\int_0^{\infty}\frac{x^{-a}}{x + 1}\,dx = \frac{\pi}{\sin a\pi},$$

where $0 < a < 1$. This opens up another way of deriving the factorial function relation given by Eq. 7.59.

Hint. You have a branch point and you will need a cut line. Recall that $z^{-a} = w$ in polar form is

$$\left[re^{i(\theta + 2\pi n)}\right]^{-a} = \rho e^{i\varphi},$$

which leads to $-a\theta - 2an\pi = \varphi$. You must restrict $n$ to zero (or any other single integer) in order that $\varphi$ may be uniquely specified. Try the contour shown in Fig. 7.15.

FIG. 7.15


7.2.20 Show that

$$\int_0^{\infty}\frac{dx}{(x^2 + a^2)^2} = \frac{\pi}{4a^3}, \qquad a > 0.$$

7.2.21 Evaluate

$$\int_{-\infty}^{\infty}\frac{x^2}{1 + x^4}\,dx.$$

ANS. $\pi/\sqrt{2}$.

7.2.22 Show that

$$\int_0^{\infty}\cos(t^2)\,dt = \int_0^{\infty}\sin(t^2)\,dt = \frac{\sqrt{\pi}}{2\sqrt{2}}.$$

Hint. Try the contour shown in Fig. 7.16.

Note. These are the Fresnel integrals for the special case of infinity as the upper limit. For the general case of a varying upper limit, asymptotic expansions of the Fresnel integrals are the topic of Exercise 5.11.2. Spherical Bessel expansions are the subject of Exercise 11.7.13.

FIG. 7.16

7.2.23 Several of the Bromwich integrals, Section 15.12, involve a portion that may be approximated by

$$I(y) = \int_{0 + iy}^{a + iy}\frac{e^{zt}}{z^{1/2}}\,dz.$$

Here $a$ and $t$ are positive and finite. Show that

$$\lim_{y\to\infty} I(y) = 0.$$

7.2.24 Show that

$$\int_0^{\infty}\frac{dx}{1 + x^n} = \frac{\pi/n}{\sin(\pi/n)}.$$

Hint. Try the contour shown in Fig. 7.17.

FIG. 7.17

7.2.25 (a) Show that

$$f(z) = z^4 - 2\cos 2\theta\,z^2 + 1$$

has zeros at $e^{i\theta}$, $e^{-i\theta}$, $-e^{i\theta}$, and $-e^{-i\theta}$.

(b) Show that

$$\int_{-\infty}^{\infty}\frac{dx}{x^4 - 2\cos 2\theta\,x^2 + 1} = \frac{\pi}{2\sin\theta} = \frac{\pi}{2^{1/2}(1 - \cos 2\theta)^{1/2}}.$$

Exercise 7.2.24 ($n = 4$) is a special case of this result.

7.2.26 Show that

$$\int_{-\infty}^{\infty}\frac{x^2\,dx}{x^4 - 2\cos 2\theta\,x^2 + 1} = \frac{\pi}{2\sin\theta} = \frac{\pi}{2^{1/2}(1 - \cos 2\theta)^{1/2}}.$$

Exercise 7.2.21 is a special case of this result.


7.2.27 Apply the techniques of Example 7.2.4 to the evaluation of the improper integral

$$I = \int_{-\infty}^{\infty}\frac{dx}{x^2 - \sigma^2}.$$

(a) Let $\sigma \to \sigma + i\gamma$.

(b) Let $\sigma \to \sigma - i\gamma$.

(c) Take the Cauchy principal value.

7.2.28 The integral in Exercise 7.2.17 may be transformed into

$$\int_0^{\infty} e^{-y}\frac{y^2}{1 + e^{-2y}}\,dy = \frac{\pi^3}{16}.$$

Evaluate this integral by the Gauss-Laguerre quadrature, Appendix A2, and compare your result with $\pi^3/16$.

ANS. Integral = 1.93775 (10 points).


7.3 DISPERSION RELATIONS 

The concept of dispersion relations entered physics with the work of Kronig 
and Kramers in optics. The name dispersion comes from optical dispersion, 
a result of the dependence of the index of refraction on wavelength or angular 
frequency. The index of refraction n may have a real part determined by the 
phase velocity and a (negative) imaginary part determined by the absorption — 
see Eq. 7.79. Kronig and Kramers showed that the real part of $(n^2 - 1)$ could
be expressed as an integral of the imaginary part. Generalizing this, we shall 
apply the label dispersion relations to any pair of equations giving the real 
part of a function as an integral of its imaginary part and the imaginary part 
as an integral of its real part — Eqs. 7.71a and 7.71b that follow. The existence 
of such integral relations might be suspected as an integral analog of the 
Cauchy-Riemann differential relations, Section 6.2. 

The applications in modern physics are widespread. For instance, the real 
part of the function might describe the forward scattering of a gamma ray in 
a nuclear Coulomb field (a dispersive process). Then the imaginary part would 
describe the electron-positron pair production in that same Coulomb field 
(the absorptive process). As will be seen later, the dispersion relations may be 
taken as a consequence of causality and therefore are independent of the details 
of the particular interaction. 

We consider a complex function $f(z)$ that is analytic in the upper half-plane and on the real axis. We also require that

$$\lim_{|z|\to\infty}|f(z)| = 0, \qquad 0 \le \arg z \le \pi, \tag{7.66}$$

in order that the integral over an infinite semicircle will vanish. The point of these conditions is that we may express $f(z)$ by the Cauchy integral formula, Eq. 6.43,

$$f(z_0) = \frac{1}{2\pi i}\oint\frac{f(z)}{z - z_0}\,dz. \tag{7.67}$$

The integral over the upper semicircle¹ vanishes and we have

$$f(z_0) = \frac{1}{2\pi i}\int_{-\infty}^{\infty}\frac{f(x)}{x - z_0}\,dx. \tag{7.68}$$

The integral over the contour shown in Fig. 7.18 has become an integral along 
the x-axis. 

Equation 7.68 assumes that $z_0$ is in the upper half-plane, interior to the closed contour. If $z_0$ were in the lower half-plane, the integral would yield zero by the Cauchy integral theorem, Section 6.3. Now, either letting $z_0$ approach the real axis from above ($z_0 \to x_0$), or placing it on the real axis and taking an average of Eq. 7.68 and zero, we find that Eq. 7.68 becomes

$$f(x_0) = \frac{1}{\pi i}P\int_{-\infty}^{\infty}\frac{f(x)}{x - x_0}\,dx, \tag{7.69}$$

where $P$ indicates the Cauchy principal value.

Splitting Eq. 7.69 into real and imaginary parts² yields

$$f(x_0) = u(x_0) + iv(x_0) = \frac{1}{\pi}P\int_{-\infty}^{\infty}\frac{v(x)}{x - x_0}\,dx - \frac{i}{\pi}P\int_{-\infty}^{\infty}\frac{u(x)}{x - x_0}\,dx. \tag{7.70}$$

Finally, equating real part to real part and imaginary part to imaginary part, we obtain

$$u(x_0) = \frac{1}{\pi}P\int_{-\infty}^{\infty}\frac{v(x)}{x - x_0}\,dx, \tag{7.71a}$$

$$v(x_0) = -\frac{1}{\pi}P\int_{-\infty}^{\infty}\frac{u(x)}{x - x_0}\,dx. \tag{7.71b}$$

¹ The use of a semicircle to close the path of integration is convenient, not mandatory. Other paths are possible.

² The second argument, $y = 0$, is dropped: $u(x_0, 0) \to u(x_0)$.

These are the dispersion relations. The real part of our complex function is 
expressed as an integral over the imaginary part. The imaginary part is expressed 
as an integral over the real part. The real and imaginary parts are Hilbert 
transforms of each other. Note that these relations are meaningful only when 
f(x) is a complex function of the real variable x. Compare Exercise 7.3.1. 
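The dispersion relations, Eqs. 7.71a and 7.71b, can be tried out on a simple case. The sketch below is illustrative Python, not part of the original text: it assumes the test function $f(z) = 1/(z + i)$, which is analytic in the upper half-plane and vanishes as $|z| \to \infty$, so that $u(x) = x/(x^2 + 1)$ and $v(x) = -1/(x^2 + 1)$. The helper name `principal_value`, the folding of the integral about $x_0$, and the mapping $s = t/(1 - t)$ are our own choices for handling the principal value numerically.

```python
import math

def u(x):
    """Real part of f(x) = 1/(x + i)."""
    return x / (x * x + 1.0)

def v(x):
    """Imaginary part of f(x) = 1/(x + i)."""
    return -1.0 / (x * x + 1.0)

def principal_value(g, x0, n=200000):
    """P-integral of g(x)/(x - x0) over the real axis.

    Folding about x0 gives integral_0^inf [g(x0 + s) - g(x0 - s)]/s ds, which has
    no singularity; s = t/(1 - t) then maps the range onto 0 < t < 1.
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h                  # midpoint rule in t
        s = t / (1.0 - t)
        total += (g(x0 + s) - g(x0 - s)) / s / (1.0 - t) ** 2
    return total * h

x0 = 0.7  # illustrative point on the real axis
u_from_v = principal_value(v, x0) / math.pi     # Eq. 7.71a
v_from_u = -principal_value(u, x0) / math.pi    # Eq. 7.71b
print(u_from_v, u(x0))
print(v_from_u, v(x0))
```

Each member of the pair is recovered from the other, as expected for Hilbert transforms of each other.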

From a physical point of view $u(x)$ and/or $v(x)$ represent some physical
measurements. Then f(z) = u(z) + iv(z) is an analytic continuation over the 
upper half-plane, with the value on the real axis serving as a boundary condition. 


Symmetry Relations 

On occasion $f(x)$ will satisfy a symmetry relation and the integral from $-\infty$ to $+\infty$ may be replaced by an integral over positive values only. This is of considerable physical importance because the variable $x$ might represent a frequency and only zero and positive frequencies are available for physical measurements. Suppose³

$$f(-x) = f^*(x). \tag{7.72}$$

Then

$$u(-x) + iv(-x) = u(x) - iv(x). \tag{7.73}$$

The real part of $f(x)$ is even and the imaginary part is odd.⁴ In quantum mechanical scattering problems these relations (Eq. 7.73) are called crossing conditions. To exploit these crossing conditions, we rewrite Eq. 7.71a as


$$u(x_0) = \frac{1}{\pi}P\int_{-\infty}^{0}\frac{v(x)}{x - x_0}\,dx + \frac{1}{\pi}P\int_{0}^{\infty}\frac{v(x)}{x - x_0}\,dx. \tag{7.74}$$

Letting $x \to -x$ in the first integral on the right-hand side of Eq. 7.74 and substituting $v(-x) = -v(x)$ from Eq. 7.73, we obtain

$$u(x_0) = \frac{1}{\pi}P\int_0^{\infty} v(x)\left\{\frac{1}{x + x_0} + \frac{1}{x - x_0}\right\}dx = \frac{2}{\pi}P\int_0^{\infty}\frac{x\,v(x)}{x^2 - x_0^2}\,dx. \tag{7.75}$$

Similarly,

$$v(x_0) = -\frac{2}{\pi}P\int_0^{\infty}\frac{x_0\,u(x)}{x^2 - x_0^2}\,dx. \tag{7.76}$$

³ This is not just a happy coincidence. It ensures that the Fourier transform of $f(x)$ will be real. In turn, Eq. 7.72 is a consequence of obtaining $f(x)$ as the Fourier transform of a real function.

⁴ $u(x, 0) = u(-x, 0)$, $v(x, 0) = -v(-x, 0)$. Compare these symmetry conditions with those that follow from the Schwarz reflection principle, Section 6.5.

The original Kronig-Kramers optical dispersion relations were in this form. The asymptotic behavior ($x_0 \to \infty$) of Eqs. 7.75 and 7.76 leads to quantum mechanical sum rules, Exercise 7.3.4.


Optical Dispersion 

The function $\exp[i(kx - \omega t)]$ describes a wave moving along the $x$-axis in the positive direction with velocity $v = \omega/k$. Here $\omega$ is the angular frequency, $k$ the wave number or propagation vector, and $n = ck/\omega$ the index of refraction. From Maxwell’s equations, electric permittivity $\varepsilon$, and Ohm’s law with conductivity $\sigma$ the propagation vector $k$ for a dielectric becomes⁵

$$k^2 = \varepsilon\frac{\omega^2}{c^2}\left(1 + i\frac{4\pi\sigma}{\omega\varepsilon}\right) \tag{7.77}$$

(with $\mu$, the magnetic permeability, taken to be unity). The presence of the conductivity (which means absorption) gives rise to an imaginary part. The propagation vector $k$ (and therefore the index of refraction $n$) have become complex.

Conversely, the (positive) imaginary part implies absorption. For poor conductivity ($4\pi\sigma/\omega\varepsilon \ll 1$) a binomial expansion yields

$$k = \sqrt{\varepsilon}\,\frac{\omega}{c} + i\frac{2\pi\sigma}{c\sqrt{\varepsilon}}$$

and

$$e^{i(kx - \omega t)} = e^{i\omega(x\sqrt{\varepsilon}/c - t)}\,e^{-2\pi\sigma x/c\sqrt{\varepsilon}},$$

an attenuated wave.
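The quality of the binomial approximation for $k$ can be gauged numerically. The following Python sketch is not part of the original text; it assumes Gaussian units with $\mu = 1$ as in Eq. 7.77, takes $k^2 = \varepsilon(\omega^2/c^2)(1 + i4\pi\sigma/\omega\varepsilon)$, sets $c = 1$ for convenience, and uses sample values of $\omega$, $\varepsilon$, and $\sigma$ that are purely illustrative.

```python
import cmath
import math

def k_exact(omega, eps, sigma, c=1.0):
    """Root of Eq. 7.77 with positive real part (Gaussian units, mu = 1)."""
    return cmath.sqrt(eps * (omega / c) ** 2 * (1.0 + 4j * math.pi * sigma / (omega * eps)))

def k_binomial(omega, eps, sigma, c=1.0):
    """First-order binomial expansion, valid for 4*pi*sigma/(omega*eps) << 1."""
    return math.sqrt(eps) * omega / c + 2j * math.pi * sigma / (c * math.sqrt(eps))

omega, eps, sigma = 1.0, 2.0, 1.0e-3   # illustrative values with 4*pi*sigma/(omega*eps) << 1
exact = k_exact(omega, eps, sigma)
approx = k_binomial(omega, eps, sigma)
print(exact, approx)
```

The positive imaginary part of $k$ is what produces the decaying factor in the attenuated wave.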

Returning to the general expression for $k^2$, we find from Eq. 7.77 that the index of refraction becomes

$$n^2 = \frac{c^2k^2}{\omega^2} = \varepsilon + i\frac{4\pi\sigma}{\omega}. \tag{7.78}$$

We take $n^2$ to be a function of the complex variable $\omega$ (with $\varepsilon$ and $\sigma$ depending on $\omega$). However, $n^2$ does not vanish as $\omega \to \infty$ but instead approaches unity. So to satisfy the condition, Eq. 7.66, one works with $f(\omega) = n^2(\omega) - 1$. The original Kronig-Kramers optical dispersion relations were in the form of

$$\Re[n^2(\omega_0) - 1] = \frac{2}{\pi}P\int_0^{\infty}\frac{\omega\,\Im[n^2(\omega) - 1]}{\omega^2 - \omega_0^2}\,d\omega,$$

$$\Im[n^2(\omega_0) - 1] = -\frac{2}{\pi}P\int_0^{\infty}\frac{\omega_0\,\Re[n^2(\omega) - 1]}{\omega^2 - \omega_0^2}\,d\omega. \tag{7.79}$$

Knowledge of the absorption coefficient at all frequencies specifies the real part of the index of refraction and vice versa.

⁵ See J. D. Jackson, Classical Electrodynamics, 2nd ed., Section 7.7, New York: Wiley (1975). Equation 7.77 follows Jackson in the use of Gaussian units.


The Parseval Relation 

When the functions $u(x)$ and $v(x)$ are Hilbert transforms of each other and each is square integrable,⁶ the two functions are related by

$$\int_{-\infty}^{\infty}|u(x)|^2\,dx = \int_{-\infty}^{\infty}|v(x)|^2\,dx. \tag{7.80}$$

This is the Parseval relation.

To derive Eq. 7.80, we start with

$$\int_{-\infty}^{\infty}|u(x)|^2\,dx = \int_{-\infty}^{\infty}\left[\frac{1}{\pi}\int_{-\infty}^{\infty}\frac{v(s)\,ds}{s - x}\right]\left[\frac{1}{\pi}\int_{-\infty}^{\infty}\frac{v(t)\,dt}{t - x}\right]dx,$$

using Eq. 7.71a twice.

Integrating first with respect to $x$, we have

$$\int_{-\infty}^{\infty}|u(x)|^2\,dx = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left[\frac{1}{\pi^2}\int_{-\infty}^{\infty}\frac{dx}{(s - x)(t - x)}\right]v(s)\,ds\,v(t)\,dt. \tag{7.81}$$

From Exercise 7.3.8 the $x$ integration yields a delta function:

$$\frac{1}{\pi^2}\int_{-\infty}^{\infty}\frac{dx}{(s - x)(t - x)} = \delta(s - t).$$

We have

$$\int_{-\infty}^{\infty}|u(x)|^2\,dx = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}v(s)\,\delta(s - t)\,ds\,v(t)\,dt. \tag{7.82}$$

Then the $s$ integration is carried out by inspection, using the defining property of the delta function:

$$\int_{-\infty}^{\infty}v(s)\,\delta(s - t)\,ds = v(t). \tag{7.83}$$

Substituting Eq. 7.83 into Eq. 7.82, we have Eq. 7.80, the Parseval relation. Again, in terms of optics, the presence of refraction over some frequency range ($n \ne 1$) implies the existence of absorption and vice versa.
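The Parseval relation can be illustrated with a concrete Hilbert transform pair. The sketch below is illustrative Python, not part of the original text; it assumes the pair $u(x) = x/(x^2 + 1)$, $v(x) = -1/(x^2 + 1)$ (the real and imaginary parts of $1/(x + i)$), and the substitution $x = \tan\varphi$ used to map the infinite range onto a finite one is our own choice.

```python
import math

def u(x):
    """Real part of 1/(x + i)."""
    return x / (x * x + 1.0)

def v(x):
    """Imaginary part of 1/(x + i)."""
    return -1.0 / (x * x + 1.0)

def integral_of_square(g, n=100000):
    """Integral of |g(x)|^2 over the real axis via x = tan(phi), midpoint rule in phi."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        phi = -0.5 * math.pi + (k + 0.5) * h
        x = math.tan(phi)
        total += g(x) ** 2 * (1.0 + x * x)   # dx = (1 + x^2) dphi
    return total * h

norm_u = integral_of_square(u)
norm_v = integral_of_square(v)
print(norm_u, norm_v)
```

Both integrals come out equal, in agreement with Eq. 7.80.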


Causality 

The real significance of the dispersion relations in physics is that they are a direct consequence of assuming that the particular physical system obeys causality. Causality is awkward to define precisely but the general meaning is that the effect cannot precede the cause. A scattered wave cannot be emitted by the scattering center before the incident wave has arrived. For linear systems the most general relation between an input function $G$ (the cause) and an output function $H$ (the effect) may be written as

$$H(t) = \int_{-\infty}^{\infty}F(t - t')\,G(t')\,dt'. \tag{7.84}$$

Causality is imposed by requiring that

$$F(t - t') = 0 \qquad\text{for } t - t' < 0.$$

Equation 7.84 gives the time dependence. The frequency dependence is obtained by taking Fourier transforms. By the Fourier convolution theorem, Section 15.5,

$$h(\omega) = f(\omega)\,g(\omega),$$

where $f(\omega)$ is the Fourier transform of $F(t)$, and so on. Conversely, $F(t)$ is the Fourier transform of $f(\omega)$.

⁶ This means that $\int_{-\infty}^{\infty}|u(x)|^2\,dx$ and $\int_{-\infty}^{\infty}|v(x)|^2\,dx$ are finite.

The connection with the dispersion relations is provided by the Titchmarsh theorem.⁷ This states that if $f(\omega)$ is square integrable over the real $\omega$-axis, then any one of the following three statements implies the other two.

1. The Fourier transform of $f(\omega)$ is zero for $t < 0$: Eq. 7.84.

2. Replacing $\omega$ by $z$, the function $f(z)$ is analytic in the complex $z$ plane for $y > 0$ and approaches $f(x)$ almost everywhere as $y \to 0$. Further,

$$\int_{-\infty}^{\infty}|f(x + iy)|^2\,dx < K \qquad\text{for } y > 0,$$

that is, the integral is bounded.

3. The real and imaginary parts of $f(z)$ are Hilbert transforms of each other: Eqs. 7.71a and 7.71b.


The assumption that the relationship between the input and the output of 
our linear system is causal (Eq. 7.84) means that the first statement is satisfied. 
If f(co) is square integrable, then the Titchmarsh theorem has the third statement 
as a consequence and we have dispersion relations. 


EXERCISES 

7.3.1 The function $f(z)$ satisfies the conditions for the dispersion relations. In addition, $f(z) = f^*(z^*)$, the Schwarz reflection principle, Section 6.5. Show that $f(z)$ is identically zero.

7.3.2 For $f(z)$ such that we may replace the closed contour of the Cauchy integral formula by an integral over the real axis we have

$$f(x_0) = \frac{1}{2\pi i}\left\{\int_{-\infty}^{x_0 - \delta}\frac{f(x)}{x - x_0}\,dx + \int_{x_0 + \delta}^{\infty}\frac{f(x)}{x - x_0}\,dx\right\} + \frac{1}{2\pi i}\int_{C_{x_0}}\frac{f(x)}{x - x_0}\,dx.$$

Here $C_{x_0}$ designates a small semicircle about $x_0$ in the lower half-plane. Show that this reduces to

$$f(x_0) = \frac{1}{\pi i}P\int_{-\infty}^{\infty}\frac{f(x)}{x - x_0}\,dx,$$

which is Eq. 7.69.

⁷ Refer to E. C. Titchmarsh, Introduction to the Theory of Fourier Integrals, 2nd ed., New York: Oxford University Press (1937). For a more informal discussion of the Titchmarsh theorem and further details on causality see J. Hilgevoord, Dispersion Relations and Causal Description. Amsterdam: North-Holland Publishing Co. (1962).


7.3.3 (a) For $f(z) = e^{iz}$, Eq. 7.66 does not hold at the end points, $\arg z = 0, \pi$. Show, with the help of Jordan’s lemma, Section 7.2, that Eq. 7.67 still holds.

(b) For $f(z) = e^{iz}$ verify the dispersion relations, Eq. 7.71 or Eqs. 7.75 and 7.76, by direct integration.

7.3.4 With $f(x) = u(x) + iv(x)$ and $f(x) = f^*(-x)$, show that as $x_0 \to \infty$,

(a) $\displaystyle u(x_0) \sim -\frac{2}{\pi x_0^2}\int_0^{\infty}x\,v(x)\,dx$,

(b) $\displaystyle v(x_0) \sim \frac{2}{\pi x_0}\int_0^{\infty}u(x)\,dx$.

In quantum mechanics relations of this form are often called sum rules.


7.3.5 (a) Given the integral equation

\[ \frac{1}{1 + x_0^2} = \frac{1}{\pi}\,P\!\int_{-\infty}^{\infty} \frac{u(x)}{x - x_0}\,dx, \]

use Hilbert transforms to determine $u(x_0)$.

(b) Verify that the integral equation of part (a) is satisfied.

(c) From $f(z)|_{y=0} = u(x) + iv(x)$, replace $x$ by $z$ and determine $f(z)$. Verify that
the conditions for the Hilbert transforms are satisfied.

(d) Are the crossing conditions satisfied?

ANS. (a) $u(x_0) = \dfrac{x_0}{1 + x_0^2}$,

(c) $f(z) = (z + i)^{-1}$.


7.3.6 (a) If the real part of the complex index of refraction (squared) is constant (no
optical dispersion), show that the imaginary part is zero (no absorption).

(b) Conversely, if there is absorption, show that there must be dispersion. In
other words, if the imaginary part of $n^2 - 1$ is not zero, show that the real
part of $n^2 - 1$ is not constant.


7.3.7 Given $u(x) = x/(x^2 + 1)$ and $v(x) = -1/(x^2 + 1)$, show by direct evaluation of each
integral that

\[ \int_{-\infty}^{\infty} |u(x)|^2\,dx = \int_{-\infty}^{\infty} |v(x)|^2\,dx. \]

ANS. $\displaystyle \int_{-\infty}^{\infty} |u(x)|^2\,dx = \int_{-\infty}^{\infty} |v(x)|^2\,dx = \frac{\pi}{2}$.
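The two integrals of Exercise 7.3.7 can also be checked numerically. The following is only a sketch, not a substitute for the direct evaluation the exercise asks for: it assumes the substitution $x = \tan\theta$, which maps the real axis onto $(-\pi/2, \pi/2)$, and a plain midpoint rule. The function names are our own.

```python
import math


def integrate_over_real_line(func, n=20000):
    """Midpoint rule for the integral of func over (-inf, inf), using x = tan(theta)."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        theta = -math.pi / 2 + (k + 0.5) * h
        x = math.tan(theta)
        # dx = sec^2(theta) d(theta)
        total += func(x) / math.cos(theta) ** 2 * h
    return total


def u(x):
    return x / (x * x + 1.0)


def v(x):
    return -1.0 / (x * x + 1.0)


norm_u = integrate_over_real_line(lambda x: u(x) ** 2)
norm_v = integrate_over_real_line(lambda x: v(x) ** 2)
print(f"integral of |u|^2 = {norm_u:.8f}")
print(f"integral of |v|^2 = {norm_v:.8f}")
print(f"pi/2              = {math.pi / 2:.8f}")
```

Both quadratures should agree with each other and with $\pi/2$ to within the accuracy of the midpoint rule.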


7.3.8 Take $u(x) = \delta(x)$, a delta function, and assume that the Hilbert transform equations
hold.

(a) Show that

\[ \delta(w) = \frac{1}{\pi^2}\int_{-\infty}^{\infty} \frac{dy}{y(y - w)}. \]

(b) With changes of variables $w = s - t$ and $x = s - y$, transform the $\delta$ representation
of part (a) into

\[ \delta(s - t) = \frac{1}{\pi^2}\int_{-\infty}^{\infty} \frac{dx}{(x - s)(x - t)}. \]

Note. The $\delta$ function is discussed in Section 8.7.


7.3.9 Show that

\[ \delta(x) = \frac{1}{\pi^2}\int_{-\infty}^{\infty} \frac{dt}{t(t - x)} \]

is a valid representation of the delta function in the sense that

\[ \int_{-\infty}^{\infty} f(x)\,\delta(x)\,dx = f(0). \]

Assume that $f(x)$ satisfies the condition for the existence of a Hilbert transform.
Hint. Apply Eq. 7.69 twice.


7.4 THE METHOD OF STEEPEST DESCENTS 


In analyzing problems in mathematical physics, one often finds it desirable 
to know the behavior of a function for large values of the variable, that is, 
the asymptotic behavior of the function. Specific examples are furnished by the 
gamma function (Chapter 10) and the various Bessel functions (Chapter 11). 
The method of steepest descents is a method of determining such asymptotic 
behavior when the function can be expressed as an integral of the general form 


\[ I(s) = \int_C g(z)\,e^{s f(z)}\,dz. \tag{7.85} \]


For the present, let us take s to be real. The contour of integration C is then 
chosen so that the real part of f(z) approaches minus infinity at both limits and 
that the integrand will vanish at the limits, or is chosen as a closed contour. 
It is further assumed that the factor g(z) in the integrand is dominated by the 
exponential in the region of interest. 

If the parameter s is large and positive, the value of the integrand will become 
large when the real part of f(z) is large and small when the real part of f(z) is 
small or negative. In particular, as s is permitted to increase indefinitely (leading 
to the asymptotic dependence), the entire contribution of the integrand to the 
integral will come from the region in which the real part of f(z) takes on a 
positive maximum value. Away from this positive maximum the integrand will 
become negligibly small in comparison. This is seen by expressing $f(z)$ as

\[ f(z) = u(x, y) + iv(x, y). \]

Then the integral may be written as

\[ I(s) = \int_C g(z)\,e^{s u(x,y)}\,e^{i s v(x,y)}\,dz. \tag{7.86} \]

If now, in addition, we impose the condition that the imaginary part of the
exponent, $iv(x, y)$, be constant in the region in which the real part takes on its
maximum value, that is, $v(x, y) = v(x_0, y_0) = v_0$, we may approximate the
integral by

\[ I(s) \approx e^{i s v_0}\int_C g(z)\,e^{s u(x,y)}\,dz. \tag{7.87} \]

Away from the maximum of the real part, the imaginary part may be permitted 
to oscillate as it wishes, for the integrand is negligibly small and the varying 
phase factor is therefore irrelevant. 

The real part of $sf(z)$ is a maximum for a given $s$ when the real part of $f(z)$,
$u(x, y)$, is a maximum. This implies that

\[ \frac{\partial u}{\partial x} = \frac{\partial u}{\partial y} = 0 \]

and therefore, by use of the Cauchy-Riemann conditions of Section 6.2,

\[ \frac{df(z)}{dz} = 0. \tag{7.88} \]


We proceed to search for such zeros of the derivative. 

It is essential to note that the maximum value of u(x,y) is the maximum 
only along a given contour. In the finite plane neither the real nor the imaginary 
part of our analytic function possesses an absolute maximum. This may be 
seen by recalling that both u and v satisfy Laplace’s equation 


\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0. \tag{7.89} \]

From this, if the second derivative with respect to $x$ is positive, the second
derivative with respect to $y$ must be negative, and therefore neither $u$ nor $v$
can possess an absolute maximum or minimum. Since the function $f(z)$ was
taken to be analytic, singular points are clearly excluded. The vanishing of the
derivative (Eq. 7.88) then implies that we have a saddle point, a stationary
value, which may be a maximum of $u(x, y)$ for one contour and a minimum
for another (Fig. 7.19).

Our problem, then, is to choose the contour of integration to satisfy two
conditions. (1) The contour must be chosen so that $u(x, y)$ has a maximum at
the saddle point. (2) The contour must pass through the saddle in such a way
that the imaginary part, $v(x, y)$, is a constant. This second condition leads to
the path of steepest descent and gives the method its name. From Section 6.2,
especially Exercise 6.2.1, we know that the curves corresponding to $u = $ constant
and $v = $ constant form an orthogonal system. This means that a curve $v = c_i$,
constant, is everywhere tangential to the gradient of $u$, $\nabla u$. Hence the curve
$v = $ constant is the curve that gives the line of steepest descent from the saddle
point.¹

¹ The line of steepest ascent is also characterized by constant $v$. The saddle
point must be inspected carefully to distinguish the line of steepest descent
from the line of steepest ascent. This is discussed later in two examples.

FIG. 7.19 A saddle point

At the saddle point the function $f(z)$ can be expanded in a Taylor series to
give

\[ f(z) = f(z_0) + \tfrac{1}{2}(z - z_0)^2 f''(z_0) + \cdots. \tag{7.90} \]

The first derivative is absent, since obviously Eq. 7.88 is satisfied. The first
correction term, $\tfrac{1}{2}(z - z_0)^2 f''(z_0)$, is real and negative. It is real, for we have
specified that the imaginary part shall be constant along our contour, and
negative because we are moving down from the saddle point or mountain
pass. Then, assuming that $f''(z_0) \neq 0$, we have

\[ f(z) - f(z_0) \approx \tfrac{1}{2}(z - z_0)^2 f''(z_0) = -\frac{1}{2s}\,t^2, \tag{7.91} \]

which serves to define a new variable $t$. If $(z - z_0)$ is written in polar form

\[ (z - z_0) = \delta e^{i\alpha} \tag{7.92} \]

(with the phase $\alpha$ held constant), we have

\[ t^2 = -s f''(z_0)\,\delta^2 e^{2i\alpha}. \tag{7.93} \]

Since $t$ is real,² it may be written as

\[ t = \pm\delta\,|s f''(z_0)|^{1/2}. \tag{7.94} \]

² The phase of the contour (specified by $\alpha$) at the saddle point is chosen so
that $\Im[f(z) - f(z_0)] = 0$, that is, $\tfrac{1}{2}(z - z_0)^2 f''(z_0)$ must be real.

Substituting Eq. 7.91 into Eq. 7.85, we obtain

\[ I(s) \approx g(z_0)\,e^{s f(z_0)}\int_{-\infty}^{\infty} e^{-t^2/2}\,\frac{dz}{dt}\,dt. \tag{7.95} \]

We have

\[ \frac{dz}{dt} = \left(\frac{dt}{dz}\right)^{-1} = \left(\frac{dt}{d\delta}\,\frac{d\delta}{dz}\right)^{-1} = |s f''(z_0)|^{-1/2}\,e^{i\alpha}, \tag{7.96} \]

from Eqs. 7.92 and 7.94. Equation 7.95 becomes

\[ I(s) \approx \frac{g(z_0)\,e^{s f(z_0)}\,e^{i\alpha}}{|s f''(z_0)|^{1/2}}\int_{-\infty}^{\infty} e^{-t^2/2}\,dt. \tag{7.97} \]

It will be noted that the limits have been set as minus infinity to plus infinity. This
is permissible, for the integrand is essentially zero when $t$ departs appreciably
from the origin. Noting that the remaining integral is just a Gauss error integral
equal to $\sqrt{2\pi}$, we finally obtain

\[ I(s) \approx \frac{\sqrt{2\pi}\,g(z_0)\,e^{s f(z_0)}\,e^{i\alpha}}{|s f''(z_0)|^{1/2}}. \tag{7.98} \]


The phase $\alpha$ was introduced in Eq. 7.92 as the phase of the contour as it passed
through the saddle point. It is chosen so that the two conditions given [$\alpha = $
constant; $\Re f(z) = $ maximum] are satisfied. It sometimes happens that the contour
passes through two or more saddle points in succession. If this is the case,
we need only add the contribution made by Eq. 7.98 from each of the saddle
points in order to get an approximation for the total integral.

One note of warning: We assumed that the only significant contribution to
the integral came from the immediate vicinity of the saddle point(s) $z = z_0$, that
is,

\[ \Re[f(z)] = u(x, y) \ll u(x_0, y_0) \]

over the entire contour away from $z_0 = x_0 + iy_0$. This condition must be
checked for each new problem (Exercise 7.4.5).
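To see how good the leading term of Eq. 7.98 can be, it may help to compare it with a direct numerical quadrature. The sketch below is an illustration under our own assumptions, not part of the original development: the test integral ($g = 1$, $f(z) = \cos z - 1$ on the real segment $[-\pi, \pi]$), the function names, and the midpoint rule are all our choices. For this $f$ the saddle point is $z_0 = 0$ with $f(z_0) = 0$, $f''(z_0) = -1$, and the real axis is the path of steepest descent ($\alpha = 0$); the integrand at the end points is of order $e^{-2s}$ and therefore negligible.

```python
import cmath
import math


def steepest_descent_estimate(g0, f0, f2, s, alpha):
    """Leading term of Eq. 7.98.

    g0 = g(z0), f0 = f(z0), f2 = f''(z0); alpha is the phase of the
    contour where it crosses the saddle point z0.
    """
    prefactor = math.sqrt(2.0 * math.pi) * g0 / math.sqrt(abs(s * f2))
    return prefactor * cmath.exp(s * f0) * cmath.exp(1j * alpha)


def midpoint_integral(func, a, b, n=20000):
    """Plain midpoint rule on [a, b] with n panels."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))


# Illustrative test case (our choice): g(z) = 1, f(z) = cos z - 1.
s = 50.0
numeric = midpoint_integral(lambda x: math.exp(s * (math.cos(x) - 1.0)),
                            -math.pi, math.pi)
estimate = steepest_descent_estimate(g0=1.0, f0=0.0, f2=-1.0, s=s, alpha=0.0)
ratio = numeric / estimate.real
print(f"numeric = {numeric:.6f}  Eq. 7.98 = {estimate.real:.6f}  ratio = {ratio:.5f}")
```

The ratio should approach unity as $s$ grows, the deviation being a measure of the neglected higher terms of the asymptotic expansion.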


EXAMPLE 7.4.1 Asymptotic Form of the Hankel Function, $H_\nu^{(1)}(s)$

In Section 11.4 it is shown that the Hankel functions, which satisfy Bessel’s
equation, may be defined by

\[ H_\nu^{(1)}(s) = \frac{1}{\pi i}\int_{0\;(C_1)}^{-\infty} e^{(s/2)(z - 1/z)}\,\frac{dz}{z^{\nu+1}}, \tag{7.99} \]

\[ H_\nu^{(2)}(s) = \frac{1}{\pi i}\int_{-\infty\;(C_2)}^{0} e^{(s/2)(z - 1/z)}\,\frac{dz}{z^{\nu+1}}. \tag{7.100} \]

The contour $C_1$ is the curve in the upper half-plane of Fig. 7.20. The contour $C_2$
is in the lower half-plane. We apply the method of steepest descents to the first
Hankel function, $H_\nu^{(1)}(s)$, which is conveniently in the form specified by Eq. 7.85,
with $f(z)$ given by

\[ f(z) = \frac{1}{2}\left(z - \frac{1}{z}\right). \tag{7.101} \]

By differentiating, we obtain

\[ f'(z) = \frac{1}{2} + \frac{1}{2z^2}. \tag{7.102} \]

Setting $f'(z) = 0$ in accordance with Eq. 7.88, we obtain

\[ z = i,\; -i. \tag{7.103} \]

FIG. 7.20 Hankel function contours

Hence there are saddle points at $z = +i$ and $z = -i$. The integral for $H_\nu^{(1)}(s)$ is
chosen so that it starts at the origin, moves out tangentially to the positive real
axis, and then moves around through the saddle point at $z = +i$ and on out to
minus infinity, asymptotic with the negative real axis. We must choose the contour
through the point $z = +i$ in such a way that the real part of $(z - 1/z)$ will be
a maximum and the phase will be constant in the vicinity of the saddle point.
We have

\[ \Re\left(z - \frac{1}{z}\right) = 0 \quad \text{for } z = i. \]

We require $\Re(z - 1/z) < 0$ for the rest of $C_1$, $(z \neq i)$.

In the vicinity of the saddle point at $z_0 = +i$ we have

\[ z - i = \delta e^{i\alpha}, \tag{7.104} \]

where $\delta$ is a small number. Then

\[ 2f(z) = z - \frac{1}{z} = \delta e^{i\alpha} + i - \frac{1}{\delta e^{i\alpha} + i} \]
\[ = \delta\cos\alpha + i(\delta\sin\alpha + 1) - \frac{1}{\delta\cos\alpha + i(\delta\sin\alpha + 1)} \tag{7.105} \]
\[ = \delta\cos\alpha + i(\delta\sin\alpha + 1) - \frac{\delta\cos\alpha - i(\delta\sin\alpha + 1)}{1 + 2\delta\sin\alpha + \delta^2}. \]

Therefore our real part becomes

\[ \Re\left(z - \frac{1}{z}\right) = \delta\cos\alpha - \delta\cos\alpha\,(1 + 2\delta\sin\alpha + \delta^2)^{-1}. \tag{7.106} \]

Recalling that $\delta$ is small, we expand by the binomial theorem and neglect terms
of order $\delta^3$ and higher.

\[ \Re\left(z - \frac{1}{z}\right) = 2\delta^2\cos\alpha\sin\alpha + O(\delta^3) \approx \delta^2\sin 2\alpha. \tag{7.107} \]


We see that the real part of $(z - 1/z)$ will take on an extreme value if $\sin 2\alpha$ is
an extremum, that is, if $2\alpha$ is $\pi/2$ or $3\pi/2$. Hence the phase of the contour $\alpha$ should
be chosen to be $\pi/4$ or $3\pi/4$. One choice will represent the path of steepest descent
that we want. The other choice will represent a path of steepest ascent that we
must avoid. We distinguish the two possibilities by substituting in the specific
values of $\alpha$. For $\alpha = \pi/4$

\[ \Re\left(z - \frac{1}{z}\right) = \delta^2. \tag{7.108} \]

For this choice $z = i$ is a minimum.
For $\alpha = 3\pi/4$

\[ \Re\left(z - \frac{1}{z}\right) = -\delta^2, \tag{7.109} \]

and $z = i$ is a maximum. This is the phase we want.

Direct substitution into Eq. 7.98 with $\alpha = 3\pi/4$ now yields

\[ H_\nu^{(1)}(s) = \frac{1}{\pi i}\,\frac{\sqrt{2\pi}\,i^{-\nu-1}\,e^{(s/2)(i - 1/i)}\,e^{i(3\pi/4)}}{|(s/2)(-2/i^3)|^{1/2}} = \sqrt{\frac{2}{\pi s}}\,e^{(i\pi/2)(-\nu-2)}\,e^{is}\,e^{i(3\pi/4)}. \tag{7.110} \]

By combining terms, we finally obtain

\[ H_\nu^{(1)}(s) \approx \sqrt{\frac{2}{\pi s}}\,e^{i(s - \nu(\pi/2) - \pi/4)} \tag{7.111} \]

as the leading term of the asymptotic expansion of the Hankel function $H_\nu^{(1)}(s)$.
Additional terms, if desired, may be picked up by assuming a series of descending
powers and substituting back into Bessel’s equation.

EXAMPLE 7.4.2 Asymptotic Form of the Factorial Function, $s!$

In many physical problems, particularly in the field of statistical mechanics,
it is desirable to have an accurate approximation of the gamma or factorial
function of very large numbers. As developed in Section 10.1, the factorial
function may be defined by the integral

\[ s! = \int_0^{\infty} \rho^s e^{-\rho}\,d\rho = s^{s+1}\int_0^{\infty} e^{s(\ln z - z)}\,dz. \tag{7.112} \]

Here we have made the substitution $\rho = zs$ in order to throw the integral into the
form required by Eq. 7.85. As before, we assume that $s$ is real and positive, from
which it follows that the integrand vanishes at the limits 0 and $\infty$. By differentiating
the $z$-dependence appearing in the exponent, we obtain

\[ \frac{df(z)}{dz} = \frac{d}{dz}(\ln z - z) = \frac{1}{z} - 1, \tag{7.113} \]

which shows that the point $z = 1$ is a saddle point. We let

\[ z - 1 = \delta e^{i\alpha}, \tag{7.114} \]

with $\delta$ small to describe the contour in the vicinity of the saddle point. Substituting
into $f(z) = \ln z - z$, we develop a series expansion

\[ f(z) = \ln(1 + \delta e^{i\alpha}) - (1 + \delta e^{i\alpha}) \]
\[ = \delta e^{i\alpha} - \tfrac{1}{2}\delta^2 e^{2i\alpha} + \cdots - 1 - \delta e^{i\alpha} \tag{7.115} \]
\[ = -1 - \tfrac{1}{2}\delta^2 e^{2i\alpha} + \cdots. \]

From this we see that the integrand takes on a maximum value ($e^{-s}$) at the saddle
point if we choose our contour $C$ to follow the real axis, a conclusion that the
reader may well have reached more or less intuitively.

Direct substitution into Eq. 7.98 with $\alpha = 0$ now gives

\[ s! \approx \frac{\sqrt{2\pi}\,s^{s+1}e^{-s}}{|s(-1^{-2})|^{1/2}}. \tag{7.116} \]

Thus the first term in the asymptotic expansion of the factorial function is

\[ s! \approx \sqrt{2\pi s}\,s^s e^{-s}. \tag{7.117} \]

This result is the first term in Stirling’s expansion of the factorial function. The 
method of steepest descent is probably the easiest way of obtaining this first 
term. If more terms in the expansion are desired, then the method of Section 
10.3 is preferable. 

In the foregoing example the calculation was carried out by assuming s to be 
real. This assumption is not necessary. The student may show (Exercise 7.4.6) 
that Eq. 7.117 also holds when s is replaced by the complex variable w, provided 
only that the real part of w is required to be large and positive. 
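The quality of Eq. 7.117 is easily examined numerically. The sketch below is our own illustration (the function names are hypothetical); it compares the leading term with $s! = \Gamma(s + 1)$ as supplied by Python's standard `math.gamma`. The column $1/(12s)$ is the well-known next term of Stirling's series, quoted here only for comparison.

```python
import math


def stirling(s):
    """First term of Stirling's expansion, Eq. 7.117."""
    return math.sqrt(2.0 * math.pi * s) * s ** s * math.exp(-s)


def relative_error(s):
    """Relative error of Eq. 7.117 measured against s! = Gamma(s + 1)."""
    exact = math.gamma(s + 1.0)
    return (exact - stirling(s)) / exact


for s in (1.0, 5.0, 10.0, 50.0, 100.0):
    print(f"s = {s:5.0f}   relative error = {relative_error(s):.3e}"
          f"   1/(12 s) = {1.0 / (12.0 * s):.3e}")
```

The relative error falls off roughly as $1/(12s)$, so that even for moderate $s$ the single term of Eq. 7.117 is accurate to better than one percent.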


EXERCISES 


7.4.1 Using the method of steepest descents, evaluate the second Hankel function given
by

\[ H_\nu^{(2)}(s) = \frac{1}{\pi i}\int_{-\infty\;(C_2)}^{0} e^{(s/2)(z - 1/z)}\,\frac{dz}{z^{\nu+1}}, \]

with contour $C_2$ as shown in Fig. 7.20.

ANS. $H_\nu^{(2)}(s) \approx \sqrt{\dfrac{2}{\pi s}}\,e^{-i(s - \pi/4 - \nu\pi/2)}$.

7.4.2 The negative square root in Eq. 7.94 does not appear in Eq. 7.97. What is the
justification for dropping it? Illustrate your argument by detailed reference to
$H_\nu^{(1)}(s)$, Example 7.4.1.


7.4.3 (a) In applying the method of steepest descent to the Hankel function $H_\nu^{(1)}(s)$,
show that

\[ \Re[f(z)] < \Re[f(z_0)] = 0 \]

for $z$ on the contour $C_1$ but away from the point $z = z_0 = i$.

(b) Show that

\[ \Re[f(z)] > 0 \]

for $0 < r < 1$ with $\pi/2 < \theta < \pi$ or $-\pi < \theta < -\pi/2$,
and for $r > 1$ with $-\pi/2 < \theta < \pi/2$ (Fig. 7.21).

This is why $C_1$ may not be deformed to pass through the second saddle point
$z = -i$.

FIG. 7.21


7.4.4 Determine the asymptotic dependence of the modified Bessel functions $I_\nu(x)$,
given

\[ I_\nu(x) = \frac{1}{2\pi i}\int_C e^{(x/2)(t + 1/t)}\,\frac{dt}{t^{\nu+1}}. \]

The contour starts and ends at $t = -\infty$, encircling the origin in a positive sense.
There are two saddle points. Only the one at $z = +1$ contributes significantly to
the asymptotic form.


7.4.5 Determine the asymptotic dependence of the modified Bessel function of the
second kind, $K_\nu(x)$, by using

\[ K_\nu(x) = \frac{1}{2}\int_0^{\infty} e^{(-x/2)(s + 1/s)}\,\frac{ds}{s^{1-\nu}}. \]

7.4.6 Show that Stirling’s formula

\[ s! \approx \sqrt{2\pi s}\,s^s e^{-s} \]

holds for complex values of $s$ (with $\Re(s)$ large and positive).

Hint. This involves assigning a phase to $s$ and then demanding that $\Im[sf(z)] = $
constant in the vicinity of the saddle point.

7.4.7 Assume $H_\nu^{(1)}(s)$ to have a negative power-series expansion of the form

\[ H_\nu^{(1)}(s) = \sqrt{\frac{2}{\pi s}}\,e^{i(s - \nu(\pi/2) - \pi/4)}\sum_{n=0}^{\infty} a_{-n}s^{-n}, \]

with the coefficient of the summation obtained by the method of steepest descent.
Substitute into Bessel’s equation and show that you reproduce the asymptotic
series for $H_\nu^{(1)}(s)$ given in Section 11.6.

8 DIFFERENTIAL EQUATIONS


8.1 PARTIAL DIFFERENTIAL EQUATIONS OF 
THEORETICAL PHYSICS 

Almost all the elementary and numerous advanced parts of theoretical 
physics are formulated in terms of differential equations, often partial differential 
equations. Among the most frequently encountered are the following: 

1. Laplace’s equation, $\nabla^2\psi = 0$.

This very common and very important equation 
occurs in studies of 

a. electromagnetic phenomena including electro- 
statics, dielectrics, steady currents, and magne- 
tostatics, 

b. hydrodynamics (irrotational flow of perfect 
fluid and surface waves), 

c. heat flow, 

d. gravitation. 

2. Poisson’s equation, $\nabla^2\psi = -\rho/\varepsilon_0$.

In contrast to the homogeneous Laplace equation,
Poisson’s equation is nonhomogeneous with a
source term $-\rho/\varepsilon_0$.

3. The wave (Helmholtz) and time-independent diffusion
equations, $\nabla^2\psi \pm k^2\psi = 0$.

These equations appear in such diverse phenomena 
as 

a. elastic waves in solids including vibrating 
strings, bars, membranes, 

b. sound or acoustics, 

c. electromagnetic waves, 

d. nuclear reactors. 

4. The time-dependent diffusion equation

\[ \nabla^2\psi = \frac{1}{a^2}\frac{\partial\psi}{\partial t} \]

and the corresponding four-dimensional forms
involving the d’Alembertian, a four-dimensional
analog of the Laplacian in Minkowski space,

\[ \Box^2 = \sum_{\mu}\frac{\partial^2}{\partial x_\mu^2} = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2} + \frac{1}{(ic)^2}\frac{\partial^2}{\partial t^2}. \]

5. The time-dependent wave equation, $\Box^2\psi = 0$.

6. The scalar potential equation, $\Box^2\psi = -\rho/\varepsilon_0$.

Like Poisson’s equation this equation is nonhomogeneous
with a source term $-\rho/\varepsilon_0$.

7. The Klein-Gordon equation, $\Box^2\psi = \mu^2\psi$, and the
corresponding vector equations in which the scalar
function $\psi$ is replaced by a vector function.

Other more complicated forms are common.

Other more complicated forms are common. 

8. The Schrödinger wave equation,

\[ -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi = i\hbar\frac{\partial\psi}{\partial t} \]

and

\[ -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi = E\psi \]

for the time-independent case.

9. The equations for elastic waves and for viscous 
fluids and the telegraphy equation. 

10. Maxwell’s coupled partial differential equations for 
electric and magnetic fields and those of Dirac for 
relativistic electron wave functions. For Maxwell’s 
equations see the Introduction and also Section 1.9. 

All these equations can be written in the form

\[ H\psi = F, \]

in which $H$ is a differential operator,

\[ H = H\!\left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}, \frac{\partial}{\partial t}, x, y, z, t\right), \]

$F$ is a known function, and $\psi$ is the unknown scalar (or vector) function.
Two characteristics are particularly important:

1. All these equations are linear¹ in the unknown function
$\psi$. As the easier physical and mathematical
problems are being solved, nonlinear differential
equations such as those describing shock wave
phenomena are receiving more and more attention.
The fundamental equations of atmospheric physics
are nonlinear. Turbulence, perhaps the most important
unsolved problem of classical physics, is basically
nonlinear. However, both the nonlinear differential
equations themselves and the numerical
techniques to which we often resort for determining
solutions are beyond the scope of this book.

¹ Compare Section 2.6 for definition of linearity.

2. These equations are all second-order differential 
equations [Maxwell’s and Dirac’s equations are 
first-order but involve two unknown functions. 
Eliminating one unknown yields a second-order 
differential equation for the other (compare Section 
1.9).] 


Occasionally, we encounter equations of higher order. In both the theory of 
the slow motion of a viscous fluid and the theory of an elastic body we find the 
equation 


\[ (\nabla^2)^2\psi = \left(\frac{\partial^4}{\partial x^4} + 2\frac{\partial^4}{\partial x^2\,\partial y^2} + \cdots\right)\psi = 0. \]


Fortunately, for introductory treatments such as this one these higher-order 
differential equations are relatively rare. 

Although not so frequently encountered and perhaps not so important as 
second-order differential equations, first-order differential equations do appear 
in theoretical physics. The solutions of some of the more important types of 
first-order (ordinary) equations are developed in Section 8.2. 

Some general techniques for solving the partial differential equations are 
discussed in this section. 

1. Separation of variables. The partial differential equa- 
tion is split into ordinary differential equations that 
may be attacked by Frobenius’s method, Section 8.5. 

This separation technique is introduced in Section 
2.6 and is discussed further in Section 8.3. It does not 
always work but is often the simplest method when 
it does. 

2. Integral solutions employing a Green’s function. An 
introduction to the Green’s function technique is 
given in Section 8.7. A more detailed treatment 
appears in Chapter 16. 

3. Other analytical methods such as the use of integral 
transforms. Some of the techniques in this class are 
developed and applied in Chapter 15. 

4. Numerical calculations. The development of modern 
high-speed computing machines has opened up a 
wealth of possibilities based on the calculus of finite 
differences. Here we have the relaxation methods. 

In Section 8.8 two numerical methods, the Runge-Kutta
and a predictor-corrector, are applied to
ordinary differential equations.²


8.2 FIRST-ORDER DIFFERENTIAL EQUATIONS 


Physics involves some first-order differential equations. For completeness 
(and possible review) it seems desirable to touch on them briefly. 

We consider here differential equations of the general form 


\[ \frac{dy}{dx} = f(x, y) = -\frac{P(x, y)}{Q(x, y)}. \tag{8.1} \]

Equation 8.1 is clearly a first-order, ordinary differential equation. It is first-order
because it contains the first and no higher derivatives. It is ordinary because
the only derivative, $dy/dx$, is an ordinary or total derivative. Equation 8.1 may or
may not be linear, although we shall treat the linear case explicitly later, Eq.
8.10.


Separable Variables 

Frequently Eq. 8.1 will have the special form 


\[ \frac{dy}{dx} = f(x, y) = -\frac{P(x)}{Q(y)}. \tag{8.2} \]

Then it may be rewritten as

\[ P(x)\,dx + Q(y)\,dy = 0. \]

Integrating from $(x_0, y_0)$ to $(x, y)$ yields

\[ \int_{x_0}^{x} P(x)\,dx + \int_{y_0}^{y} Q(y)\,dy = 0. \tag{8.3} \]

Since the lower limits x 0 and y 0 contribute constants, we may ignore the lower 
limits of integration and simply add a constant of integration. Note that this 
separation of variables technique does not require that the differential equation 
be linear. 


² For further details of numerical computation the reader could start with
R. W. Hamming’s Numerical Methods for Scientists and Engineers. New York:
McGraw-Hill (1973) and proceed to specialized references.

EXAMPLE 8.2.1 Boyle’s Law

In differential form Boyle’s gas law is

\[ \frac{dV}{dP} = -\frac{V}{P} \]

for the volume $V$ of a fixed quantity of gas at pressure $P$ (and constant temperature).
Separating variables, we have

\[ \frac{dV}{V} = -\frac{dP}{P} \]

or

\[ \ln V = -\ln P + C. \]

With two logarithms already, it is most convenient to rewrite the constant of
integration $C$ as $\ln k$. Then

\[ \ln V + \ln P = \ln PV = \ln k \]

and

\[ PV = k. \]
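The result $PV = k$ may be confirmed by integrating the differential form of Boyle's law numerically. The sketch below is only an illustration: it uses a standard fourth-order Runge-Kutta step (methods of this kind are the subject of Section 8.8), and the initial values, step count, and function names are our own assumptions.

```python
def rk4_step(f, x, y, h):
    """One classical fourth-order Runge-Kutta step for dy/dx = f(x, y)."""
    k1 = f(x, y)
    k2 = f(x + h / 2.0, y + h * k1 / 2.0)
    k3 = f(x + h / 2.0, y + h * k2 / 2.0)
    k4 = f(x + h, y + h * k3)
    return y + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def integrate_boyle(p0, v0, p_end, steps=1000):
    """Integrate dV/dP = -V/P from pressure p0 (volume v0) to pressure p_end."""
    dv_dp = lambda p, v: -v / p
    h = (p_end - p0) / steps
    p, v = p0, v0
    for _ in range(steps):
        v = rk4_step(dv_dp, p, v, h)
        p += h
    return v


# Hypothetical initial state in arbitrary units: P = 1, V = 2, so k = PV = 2.
p0, v0, p_end = 1.0, 2.0, 4.0
v_end = integrate_boyle(p0, v0, p_end)
print(f"P0*V0 = {p0 * v0:.8f},  P*V at the end = {p_end * v_end:.8f}")
```

The product $PV$ at the final pressure should agree with its initial value to within the truncation error of the integration.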


Exact Differential Equations 

We rewrite Eq. 8.1 as 

P(x,y)dx + Q(x,y)dy = 0. (8.4) 

This equation is said to be exact if we can match it to a differential $d\varphi$,

\[ d\varphi = \frac{\partial\varphi}{\partial x}\,dx + \frac{\partial\varphi}{\partial y}\,dy. \tag{8.5} \]

Since Eq. 8.4 has a zero on the right, we look for an unknown function $\varphi(x, y) = $
constant and $d\varphi = 0$.

We have (if such a function $\varphi(x, y)$ exists)

\[ P(x, y)\,dx + Q(x, y)\,dy = \frac{\partial\varphi}{\partial x}\,dx + \frac{\partial\varphi}{\partial y}\,dy \tag{8.6} \]

and

\[ \frac{\partial\varphi}{\partial x} = P(x, y), \qquad \frac{\partial\varphi}{\partial y} = Q(x, y). \tag{8.7} \]

The necessary and sufficient condition for our equation to be exact is that the
second, mixed partial derivatives of $\varphi(x, y)$ (assumed continuous) are independent
of the order of differentiation:

\[ \frac{\partial^2\varphi}{\partial y\,\partial x} = \frac{\partial P(x, y)}{\partial y} = \frac{\partial Q(x, y)}{\partial x} = \frac{\partial^2\varphi}{\partial x\,\partial y}. \tag{8.8} \]

Note the resemblance to the equations of Section 1.13, “Potential Theory.” If
Eq. 8.4 corresponds to a curl (equal to zero), then a potential, $\varphi(x, y)$, must exist.
If $\varphi(x, y)$ exists then from Eqs. 8.4 and 8.6 our solution is

\[ \varphi(x, y) = C. \tag{8.9} \]

We may construct $\varphi(x, y)$ from its partial derivatives just as we constructed a
magnetic vector potential in Section 1.13 from its curl.

It may well turn out that Eq. 8.4 is not exact, that Eq. 8.8 is not satisfied.
However, there always exists at least one and perhaps an infinity of integrating
factors, $\alpha(x, y)$, such that

\[ \alpha(x, y)P(x, y)\,dx + \alpha(x, y)Q(x, y)\,dy = 0 \]

is exact. Unfortunately, an integrating factor is not always obvious or easy to
find. Unlike the case of the linear first-order differential equation to be considered
next, there is no systematic way to develop an integrating factor for
Eq. 8.4.

A differential equation in which the variables have been separated is automatically
exact. An exact differential equation is not necessarily separable.
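The exactness test of Eq. 8.8 and the construction of $\varphi(x, y)$ can be illustrated numerically. In the sketch below the functions $P = 2xy$ and $Q = x^2 + 3y^2$ are a hypothetical example of our own (they derive from $\varphi = x^2 y + y^3$), the partial derivatives are estimated by central differences, and $\varphi$ is built from the line integral quoted in Exercise 8.2.7, $\varphi = \int_{x_0}^{x} P(x, y)\,dx + \int_{y_0}^{y} Q(x_0, y)\,dy$.

```python
def midpoint(func, a, b, n=2000):
    """Plain midpoint rule on [a, b] with n panels."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))


def P(x, y):
    return 2.0 * x * y            # hypothetical example: d(phi)/dx


def Q(x, y):
    return x * x + 3.0 * y * y    # hypothetical example: d(phi)/dy


def is_exact_at(x, y, eps=1e-5, tol=1e-6):
    """Check Eq. 8.8, dP/dy = dQ/dx, at one point by central differences."""
    dP_dy = (P(x, y + eps) - P(x, y - eps)) / (2.0 * eps)
    dQ_dx = (Q(x + eps, y) - Q(x - eps, y)) / (2.0 * eps)
    return abs(dP_dy - dQ_dx) < tol


def phi(x, y, x0=0.0, y0=0.0):
    """phi(x, y) - phi(x0, y0) from the line integral of Exercise 8.2.7."""
    return (midpoint(lambda t: P(t, y), x0, x)
            + midpoint(lambda t: Q(x0, t), y0, y))


print(is_exact_at(1.5, 2.0))
print(phi(1.5, 2.0), 1.5 ** 2 * 2.0 + 2.0 ** 3)
```

The numerically constructed $\varphi$ should reproduce $x^2 y + y^3$, and the curves $\varphi(x, y) = C$ are then the solutions of $P\,dx + Q\,dy = 0$ for this example.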

Linear First-order Differential Equations 

If $f(x, y)$ in Eq. 8.1 has the form $-p(x)y + q(x)$, then Eq. 8.1 becomes

\[ \frac{dy}{dx} + p(x)y = q(x). \tag{8.10} \]

Equation 8.10 is the most general linear first-order differential equation. If
$q(x) = 0$, Eq. 8.10 is homogeneous (in $y$). A nonzero $q(x)$ may represent a source
or a driving term. Equation 8.10 is linear; each term is linear in $y$ or $dy/dx$. There
are no higher powers, such as $y^2$, and no products, such as $y(dy/dx)$. Note that the
linearity refers to the $y$ and $dy/dx$; $p(x)$ and $q(x)$ need not be linear in $x$. Equation
8.10, the most important of these first-order differential equations for physics,
may be solved exactly.

Let us look for an integrating factor $\alpha(x)$ so that

\[ \alpha(x)\frac{dy}{dx} + \alpha(x)p(x)y = \alpha(x)q(x) \tag{8.11} \]

may be rewritten as

\[ \frac{d}{dx}[\alpha(x)y] = \alpha(x)q(x). \tag{8.12} \]

The purpose of this is to make the left-hand side of Eq. 8.10 a derivative so that
it can be integrated by inspection. It also, incidentally, makes Eq. 8.10 exact.
Expanding Eq. 8.12, we obtain

\[ \alpha(x)\frac{dy}{dx} + \frac{d\alpha}{dx}y = \alpha(x)q(x). \]

Comparison with Eq. 8.11 shows that we must require

\[ \frac{d\alpha}{dx} = \alpha(x)p(x). \tag{8.13} \]

Here is a differential equation for $\alpha(x)$, with the variables $\alpha$ and $x$ separable. We
separate variables, integrate, and obtain

\[ \alpha(x) = \exp\left[\int^{x} p(x)\,dx\right] \tag{8.14} \]

as our integrating factor.

With $\alpha(x)$ known we proceed to integrate Eq. 8.12. This, of course, was the
point of introducing $\alpha$ in the first place. We have

\[ \int^{x} \frac{d}{dx}[\alpha(x)y(x)]\,dx = \int^{x} \alpha(x)q(x)\,dx. \]

Now integrating by inspection, we have

\[ \alpha(x)y(x) = \int^{x} \alpha(x)q(x)\,dx + C. \]

The constants from a constant lower limit of integration are lumped into the
constant $C$. Dividing by $\alpha(x)$, we obtain

\[ y(x) = [\alpha(x)]^{-1}\left\{\int^{x} \alpha(x)q(x)\,dx + C\right\}. \]

Finally, substituting in Eq. 8.14 for $\alpha$ yields

\[ y(x) = \exp\left[-\int^{x} p(t)\,dt\right]\left\{\int^{x} \exp\left[\int^{s} p(t)\,dt\right]q(s)\,ds + C\right\}. \tag{8.15} \]

Here the (dummy) variables of integration have been rewritten to make them
unambiguous. Equation 8.15 is the complete general solution of the linear,
first-order differential equation, Eq. 8.10. The portion

\[ y_1(x) = C\exp\left[-\int^{x} p(t)\,dt\right] \tag{8.16} \]

corresponds to the case $q(x) = 0$ and is a general solution of the homogeneous
differential equation. The other term in Eq. 8.15,

\[ y_2(x) = \exp\left[-\int^{x} p(t)\,dt\right]\int^{x} \exp\left[\int^{s} p(t)\,dt\right]q(s)\,ds, \tag{8.17} \]

is a particular solution corresponding to the specific source term $q(x)$.

The reader might note that if our linear first-order differential equation is
homogeneous ($q = 0$), then it is separable. Otherwise, apart from special cases
such as $p = $ constant, $q = $ constant, or $q(x) = ap(x)$, Eq. 8.10 is not separable.
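Equation 8.15 lends itself to direct numerical evaluation. The following sketch is our own illustration, not a method taken from the text: it takes $x_0$ as the lower limit of every integral (so that $C = y(x_0)$) and accumulates both integrals with the trapezoidal rule. The test problem $y' + 2xy = x$, $y(0) = 0$, with solution $y = \tfrac{1}{2}(1 - e^{-x^2})$, is likewise an assumed example.

```python
import math


def solve_linear_first_order(p, q, x0, y0, x_end, n=4000):
    """Evaluate Eq. 8.15 at x_end by the trapezoidal rule.

    The lower limit of every integral is x0, so that the constant C of
    Eq. 8.15 equals y(x0) = y0.
    """
    h = (x_end - x0) / n
    big_a = 0.0          # running integral of p(t) from x0 to x
    inner = 0.0          # running integral of exp(A(s)) q(s) from x0 to x
    x = x0
    prev_p = p(x)
    prev_w = q(x)        # exp(A(x0)) q(x0), with A(x0) = 0
    for _ in range(n):
        x_next = x + h
        next_p = p(x_next)
        big_a += 0.5 * h * (prev_p + next_p)
        next_w = math.exp(big_a) * q(x_next)
        inner += 0.5 * h * (prev_w + next_w)
        x, prev_p, prev_w = x_next, next_p, next_w
    return math.exp(-big_a) * (inner + y0)


# Assumed test problem: y' + 2 x y = x with y(0) = 0.
y_numeric = solve_linear_first_order(lambda x: 2.0 * x, lambda x: x, 0.0, 0.0, 1.0)
y_exact = 0.5 * (1.0 - math.exp(-1.0))
print(f"Eq. 8.15 (numerical) = {y_numeric:.8f},  closed form = {y_exact:.8f}")
```

Keeping the two running integrals separate mirrors the structure of Eq. 8.15: the factor outside the braces is $1/\alpha(x)$, and the integrand inside is $\alpha(s)q(s)$.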

EXAMPLE 8.2.2 RL Circuit

For a resistance-inductance circuit Kirchhoff’s law leads to

\[ L\frac{dI(t)}{dt} + RI(t) = V(t) \]

for the current $I(t)$, where $L$ is the inductance and $R$ the resistance, both constant.
$V(t)$ is the time-dependent impressed voltage.

From Eq. 8.14 our integrating factor $\alpha(t)$ is

\[ \alpha(t) = \exp\left[\int^{t} \frac{R}{L}\,dt\right] = e^{Rt/L}. \]

Then by Eq. 8.15

\[ I(t) = e^{-Rt/L}\left[\int^{t} e^{Rt/L}\,\frac{V(t)}{L}\,dt + C\right], \]

with the constant $C$ to be determined by an initial condition (a boundary
condition).

For the special case $V(t) = V_0$, a constant,

\[ I(t) = e^{-Rt/L}\left[\frac{V_0}{L}\cdot\frac{L}{R}\,e^{Rt/L} + C\right] = \frac{V_0}{R} + Ce^{-Rt/L}. \]

If the initial condition is $I(0) = 0$, then $C = -V_0/R$ and

\[ I(t) = \frac{V_0}{R}\left[1 - e^{-Rt/L}\right]. \]
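As a quick check of the RL circuit solution for constant impressed voltage and $I(0) = 0$, namely $I(t) = (V_0/R)[1 - e^{-Rt/L}]$, one may substitute it back into Kirchhoff's equation numerically. The sketch below is an illustration only: the component values are hypothetical, and $dI/dt$ is estimated by a central difference.

```python
import math


def rl_current(t, v0, r, l):
    """I(t) = (V0/R) [1 - exp(-R t / L)] for constant voltage and I(0) = 0."""
    return (v0 / r) * (1.0 - math.exp(-r * t / l))


def ode_residual(t, v0, r, l, eps=1e-6):
    """L dI/dt + R I - V0, with dI/dt estimated by a central difference."""
    di_dt = (rl_current(t + eps, v0, r, l) - rl_current(t - eps, v0, r, l)) / (2.0 * eps)
    return l * di_dt + r * rl_current(t, v0, r, l) - v0


# Hypothetical values: V0 = 12 V, R = 4 ohm, L = 2 H (time constant L/R = 0.5 s).
v0, r, l = 12.0, 4.0, 2.0
for t in (0.0, 0.3, 0.5, 2.0, 10.0):
    print(f"t = {t:5.2f} s   I = {rl_current(t, v0, r, l):.6f} A"
          f"   residual = {ode_residual(t, v0, r, l):.2e}")
```

The residual should vanish to within the error of the difference quotient, and the current should rise from zero toward the steady value $V_0/R$.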


Conversion to Integral Equation 

Our first-order differential equation, Eq. 8.1, may be converted to an integral
equation by direct integration:

\[ y(x) - y(x_0) = \int_{x_0}^{x} f[x, y(x)]\,dx. \tag{8.18} \]

As an integral equation there is a possibility of a Neumann series solution (Section
16.3) with the initial approximation $y(x) \approx y(x_0)$. In the differential equation
literature this is called the “Picard method of successive approximations.”
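The successive approximations are easily carried out by machine when each iterate is a polynomial. The sketch below is our own illustration for the assumed test equation $dy/dx = y$, $y(0) = 1$: starting from $y(x) \approx y(x_0) = 1$, each pass through Eq. 8.18 integrates the previous approximation term by term. Exact rational arithmetic (Python's standard `fractions` module) is used so that the coefficients may be compared directly with those of the series for $e^x$.

```python
from fractions import Fraction


def picard_step(coeffs, y0):
    """One Picard iteration for dy/dx = y, y(0) = y0.

    coeffs[k] is the coefficient of x**k in the current approximation;
    the return value is y0 plus the term-by-term integral from 0 to x.
    """
    integrated = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]
    integrated[0] = Fraction(y0)
    return integrated


approx = [Fraction(1)]            # initial approximation y(x) = y(0) = 1
for _ in range(4):
    approx = picard_step(approx, 1)

print(approx)                     # coefficients of 1, x, x**2, x**3, x**4
```

Each iteration adds one further term of the Taylor series of $e^x$, which is the Neumann series for this particular equation.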



FIG. 8.1 


The relationships among the various techniques introduced in this section 
are shown in Fig. 8.1. 

First-order differential equations will be encountered again in Chapter 15 in
connection with Laplace transforms and in Chapter 17 from the Euler equation
of the calculus of variations. Numerical techniques for solving first-order differential
equations are examined in Section 8.8.


EXERCISES 

8.2.1 From Kirchhoff’s law the current $I$ in an RC (resistance-capacitance) circuit
(Fig. 8.2) obeys the equation

\[ R\frac{dI}{dt} + \frac{1}{C}I = 0. \]

(a) Find $I(t)$.

(b) For a capacitance of 10,000 microfarads charged to 100 volts and discharging
through a resistance of 1 megohm, find the current $I$ for $t = 0$ and for
$t = 100$ seconds.

Note. The initial voltage is $I_0 R$ or $Q/C$, where $Q = \int_0^{\infty} I(t)\,dt$.

FIG. 8.2 RC circuit

8.2.2 The Laplace transform of Bessel’s equation ($n = 0$) leads to

\[ (s^2 + 1)f'(s) + sf(s) = 0. \]

Solve for $f(s)$.

8.2.3 The decay of a population by catastrophic two body collisions is described by

\[ \frac{dN}{dt} = -kN^2. \]

This is a first-order, nonlinear differential equation. Derive the solution

\[ N(t) = N_0\left(1 + \frac{t}{\tau_0}\right)^{-1}, \]

where $\tau_0 = (kN_0)^{-1}$. This implies an infinite population at $t = -\tau_0$.

8.2.4 The rate of a particular chemical reaction $A + B \to C$ is proportional to the
concentrations of the reactants $A$ and $B$:

\[ \frac{dC(t)}{dt} = \alpha[A(0) - C(t)][B(0) - C(t)]. \]

(a) Find $C(t)$ for $A(0) \neq B(0)$.

(b) Find $C(t)$ for $A(0) = B(0)$.

The initial condition is that $C(0) = 0$.

8.2.5 A boat, coasting through the water, experiences a resisting force proportional to
$v^n$, $v$ being the boat’s instantaneous velocity. Newton’s second law leads to

\[ m\frac{dv}{dt} = -kv^n. \]

With $v(t = 0) = v_0$, $x(t = 0) = 0$, integrate to find $v$ as a function of time and $v$
as a function of distance.


8.2.6 In the first-order differential equation $dy/dx = f(x, y)$ the function $f(x, y)$ is a
function of the ratio $y/x$:

\[ \frac{dy}{dx} = g(y/x). \]

Show that the substitution of $u = y/x$ leads to a separable equation in $u$ and $x$.

8.2.7 The differential equation

\[ P(x, y)\,dx + Q(x, y)\,dy = 0 \]

is exact. Construct a solution

\[ \varphi(x, y) = \int_{x_0}^{x} P(x, y)\,dx + \int_{y_0}^{y} Q(x_0, y)\,dy = \text{constant}. \]

8.2.8 The differential equation

\[ P(x, y)\,dx + Q(x, y)\,dy = 0 \]

is exact. If

\[ \varphi(x, y) = \int_{x_0}^{x} P(x, y)\,dx + \int_{y_0}^{y} Q(x_0, y)\,dy, \]

show that

\[ \frac{\partial\varphi}{\partial x} = P(x, y), \qquad \frac{\partial\varphi}{\partial y} = Q(x, y). \]

Hence $\varphi(x, y) = $ constant is a solution of the original differential equation.

8.2.9 Prove that Eq. 8.11 is exact in the sense of Eq. 8.8, provided that a(x) satisfies 
Eq. 8.13. 


8.2.10 A certain differential equation has the form 

f(x)dx + g(x)h{y)dy = 0, 

with none of the functions /(x), $(x), h(y) identically zero. Show that a necessary 
and sufficient condition for this equation to be exact is that #(x) = constant. 


8.2.11 Show that 


y(x) = exp 


p(t)dt 


exp 


p(t)dt 


q(s)ds -h C 


is a solution of 
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— + p(x)y(x) = q(x) 

by differentiating the expression for y(x) and substituting into the differential 
equation. 

8.2.12 The motion of a body falling in a resisting medium may be described by

$$m\frac{dv}{dt} = mg - bv$$

when the retarding force is proportional to the velocity, $v$. Find the velocity.
Evaluate the constant of integration by demanding that $v(0) = 0$.

8.2.13 Radioactive nuclei decay according to the law

$$\frac{dN}{dt} = -\lambda N,$$

$N$ being the concentration of a given nuclide and $\lambda$, the particular decay constant.
In a radioactive series of $n$ different nuclides, starting with $N_1$,

$$\frac{dN_1}{dt} = -\lambda_1 N_1, \qquad \frac{dN_2}{dt} = \lambda_1 N_1 - \lambda_2 N_2, \quad \text{and so on.}$$

Find $N_2(t)$ for the conditions $N_1(0) = N_0$ and $N_2(0) = 0$.

8.2.14 The rate of evaporation from a particular spherical drop of liquid (constant 
density) is proportional to its surface area. Assuming this to be the sole mechanism 
of mass loss, find the radius of the drop as a function of time. 

8.2.15 In the linear homogeneous differential equation

$$\frac{dv}{dt} = -av$$

the variables are separable. When the variables are separated the equation is
exact. Solve this differential equation subject to $v(0) = v_0$ by the following three
methods:

(a) Separating variables and integrating.

(b) Treating the separated variable equation as exact.

(c) Using the result for a linear homogeneous differential equation.

ANS. $v(t) = v_0 e^{-at}$.


8.2.16 Bernoulli’s equation,

$$\frac{dy}{dx} + f(x)y = g(x)y^n,$$

is nonlinear for $n \neq 0$ or 1. Show that the substitution $u = y^{1-n}$ reduces Bernoulli’s
equation to a linear equation.

ANS. $\dfrac{du}{dx} + (1 - n)f(x)u = (1 - n)g(x)$.


8.2.17 Solve the linear, first-order equation, Eq. 8.10, by assuming $y(x) = u(x)v(x)$, where
$v(x)$ is a solution of the corresponding homogeneous equation [$q(x) = 0$]. This
is the method of variation of parameters due to Lagrange. We apply it to second-order
equations in Exercise 8.6.25.
8.3 SEPARATION OF VARIABLES— ORDINARY
DIFFERENTIAL EQUATIONS 


The equations of mathematical physics listed in Section 8.1 are all partial 
differential equations. Our first technique for their solution splits the partial 
differential equation of n variables into n ordinary differential equations. Each 
separation introduces an arbitrary constant of separation. If we have n variables, 
we have to introduce n — 1 constants, determined by the conditions imposed in 
the problem being solved. 

In Section 2.6 the technique of separation of variables was illustrated for the 
wave equation in cartesian, circular cylindrical, and spherical polar coordinates. 
In the spherical polar coordinate system the wave equation

$$\nabla^2\psi + k^2\psi = 0 \tag{8.19}$$

led to an azimuthal equation

$$\frac{d^2\Phi(\varphi)}{d\varphi^2} + m^2\Phi(\varphi) = 0, \tag{8.20}$$

in which $-m^2$ is a separation constant. As an illustration of how the constant is
restricted, we note that $\varphi$ in spherical polar coordinates is an azimuth angle. If
this is a classical problem, we shall certainly require that the azimuthal solution
$\Phi(\varphi)$ be single-valued, that is,

$$\Phi(\varphi + 2\pi) = \Phi(\varphi). \tag{8.21}$$

This is equivalent to requiring the azimuthal solution to have a period of $2\pi$, or
$2\pi$ divided by an integer.¹ Therefore $m$ must be an integer. Which integer it is
depends on the details of the problem. This is discussed in Chapter 9. Whenever
a coordinate corresponds to an axis of translation or to an azimuth angle the
separated equation always has the form

$$\frac{d^2\Phi(\varphi)}{d\varphi^2} = -m^2\Phi(\varphi)$$

for $\varphi$, the azimuth angle, and

$$\frac{d^2Z(z)}{dz^2} = \pm a^2 Z(z) \tag{8.22}$$

for $z$, an axis of translation in one of the cylindrical coordinate systems. The
solutions, of course, are $\sin az$ and $\cos az$ for $-a^2$ and the corresponding hyperbolic
functions (or exponentials) $\sinh az$ and $\cosh az$ for $+a^2$.
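The single-valuedness condition of Eq. 8.21 can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `is_single_valued` and the sampling scheme are illustrative assumptions. It tests whether $\cos m\varphi$ and $\sin m\varphi$ return to the same value after $\varphi \to \varphi + 2\pi$.

```python
import math

def is_single_valued(m, samples=50, tol=1e-9):
    """Check Phi(phi + 2*pi) == Phi(phi) for Phi = cos(m*phi) and sin(m*phi).

    Hypothetical helper for illustration: samples phi on [0, 2*pi).
    """
    for i in range(samples):
        phi = 2.0 * math.pi * i / samples
        for f in (math.cos, math.sin):
            if abs(f(m * (phi + 2.0 * math.pi)) - f(m * phi)) > tol:
                return False
    return True

print(is_single_valued(3))    # integer m
print(is_single_valued(0.5))  # half-integral m
```

For integral $m$ the check passes; for $m = 1/2$ the solution changes sign after one full turn, which is why nonintegral $m$ is rejected in the classical problem.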

¹ This also applies in most quantum mechanical problems but the argument
is much more involved. If $m$ is not an integer, rotation group relations (Section
4.9) and ladder operator relations (Section 12.7) are disrupted. Compare
E. Merzbacher, “Single Valuedness of Wave Functions.” Am. J. Phys. 30,
237 (1962).

The Legendre equation,

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + l(l+1)\Theta = 0,$$

$$(1 - x^2)\frac{d^2y}{dx^2} - 2x\frac{dy}{dx} + l(l+1)y = 0, \tag{8.23}$$

and the associated Legendre equation

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right) + l(l+1)\Theta - \frac{m^2}{\sin^2\theta}\Theta = 0,$$

$$(1 - x^2)\frac{d^2y}{dx^2} - 2x\frac{dy}{dx} + \left[l(l+1) - \frac{m^2}{1 - x^2}\right]y = 0, \tag{8.24}$$ ²


also appear frequently. As noted in Section 2.6, these equations appear when $\nabla^2$
is used in spherical polar coordinates. Prolate and oblate spheroidal coordinates
also give rise to the Legendre and associated Legendre equations.

A third equation frequently encountered is Bessel’s differential equation,

$$x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} + (x^2 - n^2)y = 0. \tag{8.25}$$


In Sections 2.4 and 2.5 circular cylindrical and spherical polar coordinates 
yielded varieties of Bessel’s equation. The separation of variables of Laplace’s 
equation in parabolic coordinates also gives rise to Bessel’s equation. It may be 
noted that the Bessel equation is notorious for the variety of disguises it may 
assume. For an extensive tabulation of possible forms the reader is referred to 
Tables of Functions by Jahnke and Emde. 3 

Other occasionally encountered ordinary differential equations include the
Laguerre and associated Laguerre equations from the supremely important
hydrogen atom problem in quantum mechanics:

$$x\frac{d^2y}{dx^2} + (1 - x)\frac{dy}{dx} + \alpha y = 0, \tag{8.26}$$

$$x\frac{d^2y}{dx^2} + (1 + k - x)\frac{dy}{dx} + \alpha y = 0. \tag{8.27}$$

From the quantum mechanical theory of the linear oscillator we have Hermite’s
equation,

$$\frac{d^2y}{dx^2} - 2x\frac{dy}{dx} + 2\alpha y = 0. \tag{8.28}$$

Finally, from time to time we find the Chebyshev differential equation

$$(1 - x^2)\frac{d^2y}{dx^2} - x\frac{dy}{dx} + n^2y = 0. \tag{8.29}$$

² These are equivalent algebraic forms in which $x = \cos\theta$.

³ Fourth revised edition. New York: Dover (1945), p. 146. Also, E. Jahnke,
F. Emde, and F. Losch, Tables of Higher Functions, 6th ed. New York:
McGraw-Hill (1960).
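TABLE 8.1 Solutions in Spherical Polar Coordinates*

$$\psi = \sum_{l,m} a_{lm}\psi_{lm}$$

1. $\nabla^2\psi = 0$:
   $\psi_{lm} = \begin{Bmatrix} r^{l} \\ r^{-l-1} \end{Bmatrix} \begin{Bmatrix} P_l^m(\cos\theta) \\ Q_l^m(\cos\theta) \end{Bmatrix} \begin{Bmatrix} \cos m\varphi \\ \sin m\varphi \end{Bmatrix}$ †

2. $\nabla^2\psi + k^2\psi = 0$:
   $\psi_{lm} = \begin{Bmatrix} j_l(kr) \\ n_l(kr) \end{Bmatrix} \begin{Bmatrix} P_l^m(\cos\theta) \\ Q_l^m(\cos\theta) \end{Bmatrix} \begin{Bmatrix} \cos m\varphi \\ \sin m\varphi \end{Bmatrix}$ †

3. $\nabla^2\psi - k^2\psi = 0$:
   $\psi_{lm} = \begin{Bmatrix} i_l(kr) \\ k_l(kr) \end{Bmatrix} \begin{Bmatrix} P_l^m(\cos\theta) \\ Q_l^m(\cos\theta) \end{Bmatrix} \begin{Bmatrix} \cos m\varphi \\ \sin m\varphi \end{Bmatrix}$ †

* References for some of the functions are $P_l^m(\cos\theta)$, $m = 0$,
Section 12.1; $m \neq 0$, Section 12.5; $Q_l^m(\cos\theta)$, Section 12.10;
$j_l(kr)$, $n_l(kr)$, $i_l(kr)$, and $k_l(kr)$, Section 11.7.
† $\cos m\varphi$ and $\sin m\varphi$ may be replaced by $e^{\pm im\varphi}$.


TABLE 8.2 Solutions in Circular Cylindrical Coordinates*

$$\psi = \sum_{m,\alpha} a_{m\alpha}\psi_{m\alpha}, \qquad \nabla^2\psi = 0$$

a. $\psi_{m\alpha} = \begin{Bmatrix} J_m(\alpha\rho) \\ N_m(\alpha\rho) \end{Bmatrix} \begin{Bmatrix} \cos m\varphi \\ \sin m\varphi \end{Bmatrix} \begin{Bmatrix} e^{-\alpha z} \\ e^{\alpha z} \end{Bmatrix}$

b. $\psi_{m\alpha} = \begin{Bmatrix} I_m(\alpha\rho) \\ K_m(\alpha\rho) \end{Bmatrix} \begin{Bmatrix} \cos m\varphi \\ \sin m\varphi \end{Bmatrix} \begin{Bmatrix} \cos\alpha z \\ \sin\alpha z \end{Bmatrix}$

c. If $\alpha = 0$ (no z-dependence):
   $\psi_{m} = \begin{Bmatrix} \rho^{m} \\ \rho^{-m} \end{Bmatrix} \begin{Bmatrix} \cos m\varphi \\ \sin m\varphi \end{Bmatrix}$

* References for the radial functions are $J_m(\alpha\rho)$, Section 11.1; $N_m(\alpha\rho)$,
Section 11.3; $I_m(\alpha\rho)$ and $K_m(\alpha\rho)$, Section 11.5.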

For convenient reference, the forms of the solutions of Laplace’s equation,
Helmholtz’s equation, and the diffusion equation for spherical polar coordinates
are collected in Table 8.1. The solutions of Laplace’s equation in circular
cylindrical coordinates are presented in Table 8.2.

For the Helmholtz and the diffusion equation the constant $\pm k^2$ is added to
the separation constant $\pm\alpha^2$ to define a new parameter $\gamma^2$ or $-\gamma^2$. For the
choice $+\gamma^2$ (with $\gamma^2 > 0$) we get $J_m(\gamma\rho)$ and $N_m(\gamma\rho)$. For the choice $-\gamma^2$ (with
$\gamma^2 > 0$) we get $I_m(\gamma\rho)$ and $K_m(\gamma\rho)$ as previously.

These ordinary differential equations and two generalizations of them will be 
examined and systematized in the next section. General properties following 
from the form of the differential equations are discussed in Chapter 9. The 
individual solutions are developed and applied in Chapters 10 to 13. 

The practicing physicist may and probably will meet other second-order 
ordinary differential equations, some of which may possibly be transformed into 
the examples studied here. Some of these differential equations may be solved by 
the techniques of Sections 8.5 and 8.6. Others may require a calculating machine 
for a numerical solution. 
EXERCISES


8.3.1 The quantum mechanical angular momentum operator is given by $\mathbf{L} = -i(\mathbf{r} \times \nabla)$.
Show that

$$\mathbf{L}\cdot\mathbf{L}\,\psi = l(l+1)\psi$$

leads to the associated Legendre equation.

Hint. Exercises 1.9.9 and 2.5.16 may be helpful.

8.3.2 The one-dimensional Schrödinger wave equation for a particle in a potential field
$V = \tfrac{1}{2}kx^2$ is

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + \frac{1}{2}kx^2\psi = E\psi(x).$$

(a) Using $\xi = ax$ and a constant $\lambda$, we have

$$a = \left(\frac{mk}{\hbar^2}\right)^{1/4}, \qquad \lambda = \frac{2E}{\hbar}\left(\frac{m}{k}\right)^{1/2};$$

show that

$$\frac{d^2\psi(\xi)}{d\xi^2} + (\lambda - \xi^2)\psi(\xi) = 0.$$

(b) Substituting

$$\psi(\xi) = y(\xi)e^{-\xi^2/2},$$

show that $y(\xi)$ satisfies the Hermite differential equation.

8.3.3 Verify that the following are solutions of Laplace’s equation:

(a) $\psi_1 = 1/r$,

(b) $\psi_2 = \dfrac{1}{2r}\ln\dfrac{r + z}{r - z}$.


8.3.4 If $\Psi$ is a solution of Laplace’s equation, $\nabla^2\Psi = 0$, show that $\partial\Psi/\partial z$ is also a
solution.

Note. The z derivatives of $1/r$ generate the Legendre polynomials, $P_n(\cos\theta)$,
Exercise 12.1.7. The z derivatives of $(1/2r)\ln[(r + z)/(r - z)]$ generate the Legendre
functions, $Q_n(\cos\theta)$.


8.4 SINGULAR POINTS

In this section the concept of a singular point or singularity (as applied to a 
differential equation) is introduced. The interest in this concept stems from its 
usefulness in (1) classifying differential equations and (2) investigating the fea- 
sibility of a series solution. This feasibility is the topic of Fuchs’s theorem, 
Sections 8.5 and 8.6. First, a definition. 

All the ordinary differential equations listed in Section 8.3 may be solved for 
$d^2y/dx^2$. Using the notation $d^2y/dx^2 = y''$, we have¹

$$y'' = f(x, y, y'). \tag{8.30}$$

Now, if in Eq. 8.30 $y$ and $y'$ can take on all finite values at $x = x_0$ and $y''$ remains
finite, point $x = x_0$ is an ordinary point. On the other hand, if $y''$ becomes infinite
for any finite choice of $y$ and $y'$, point $x = x_0$ is labeled a singular point.

Another way of presenting this definition of singular point is to write our
homogeneous differential equation as

$$y'' + P(x)y' + Q(x)y = 0. \tag{8.31}$$

Now, if the functions $P(x)$ and $Q(x)$ remain finite at $x = x_0$, point $x = x_0$ is an
ordinary point. However, if either $P(x)$ or $Q(x)$ (or both) diverges as $x \to x_0$,
point $x_0$ is a singular point.

Using Eq. 8.31, we may distinguish between two kinds of singular points. 

1. If either $P(x)$ or $Q(x)$ diverges as $x \to x_0$ but $(x - x_0)P(x)$
and $(x - x_0)^2Q(x)$ remain finite as $x \to x_0$, then
$x = x_0$ is called a regular or nonessential singular
point.

2. If $P(x)$ diverges faster than $1/(x - x_0)$, so that $(x - x_0)P(x)$
goes to infinity as $x \to x_0$, or $Q(x)$ diverges faster
than $1/(x - x_0)^2$ so that $(x - x_0)^2Q(x)$ goes to infinity
as $x \to x_0$, then point $x = x_0$ is labeled an irregular
or essential singularity.
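The two-step test above lends itself to a rough numerical check. The sketch below is only a heuristic under stated assumptions: the helper names `diverges` and `classify_point`, the step sizes, and the growth threshold are illustrative choices, not part of the text. It probes $P$, $Q$, $(x - x_0)P$, and $(x - x_0)^2Q$ at points approaching $x_0$ from the right and reports whether they keep growing.

```python
def diverges(f, x0, steps=(1e-2, 1e-4, 1e-6), growth=10.0):
    """Heuristic: True if |f| keeps growing as x -> x0 from the right."""
    values = [abs(f(x0 + h)) for h in steps]
    return all(later > growth * max(1.0, earlier)
               for earlier, later in zip(values, values[1:]))

def classify_point(P, Q, x0):
    """Classify x0 for y'' + P(x) y' + Q(x) y = 0 (Eq. 8.31)."""
    if not diverges(P, x0) and not diverges(Q, x0):
        return "ordinary"
    p_scaled = lambda x: (x - x0) * P(x)       # (x - x0) P(x)
    q_scaled = lambda x: (x - x0) ** 2 * Q(x)  # (x - x0)^2 Q(x)
    if not diverges(p_scaled, x0) and not diverges(q_scaled, x0):
        return "regular singular"
    return "irregular singular"

# Bessel's equation with n = 2, examined at x = 0.
print(classify_point(lambda x: 1.0 / x, lambda x: 1.0 - 4.0 / x**2, 0.0))
```

A numerical probe of this kind can be fooled by slowly diverging or oscillating coefficients, so it is a sanity check on the analytic limits rather than a substitute for them.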


These definitions hold for all finite values of $x_0$. The analysis of point $x \to \infty$
is similar to the treatment of functions of a complex variable (Section 6.6). We
set $x = 1/z$, substitute into the differential equation, and then let $z \to 0$. By
changing variables in the derivatives, we have

$$\frac{dy(x)}{dx} = \frac{dy(z^{-1})}{dz}\frac{dz}{dx} = -\frac{1}{x^2}\frac{dy(z^{-1})}{dz} = -z^2\frac{dy(z^{-1})}{dz}, \tag{8.32}$$

$$\frac{d^2y(x)}{dx^2} = \frac{d}{dz}\left[\frac{dy(x)}{dx}\right]\frac{dz}{dx} = (-z^2)\left[-2z\frac{dy(z^{-1})}{dz} - z^2\frac{d^2y(z^{-1})}{dz^2}\right] = 2z^3\frac{dy(z^{-1})}{dz} + z^4\frac{d^2y(z^{-1})}{dz^2}. \tag{8.33}$$

Using these results, we transform Eq. 8.31 into

$$z^4\frac{d^2y}{dz^2} + \left[2z^3 - z^2P(z^{-1})\right]\frac{dy}{dz} + Q(z^{-1})y = 0. \tag{8.34}$$


The behavior at $x = \infty$ ($z = 0$) then depends on the behavior of the new
coefficients

$$\frac{2z - P(z^{-1})}{z^2} \quad \text{and} \quad \frac{Q(z^{-1})}{z^4}$$

as $z \to 0$. If these two expressions remain finite, point $x = \infty$ is an ordinary
point. If they diverge no more rapidly than $1/z$ and $1/z^2$, respectively, point
$x = \infty$ is a regular singular point, otherwise an irregular singular point (an
essential singularity).

¹ This prime notation, $y' = dy/dx$, was introduced by Lagrange in the late
eighteenth century as an abbreviation for Leibnitz’s more explicit but more
cumbersome $dy/dx$.


EXAMPLE 8.4.1 


Bessel’s equation is

$$x^2y'' + xy' + (x^2 - n^2)y = 0.$$

Comparing it with Eq. 8.31, we have

$$P(x) = \frac{1}{x}, \qquad Q(x) = 1 - \frac{n^2}{x^2}, \tag{8.35}$$

which shows that point $x = 0$ is a regular singularity. By inspection we see that
there are no other singular points in the finite range. As $x \to \infty$ ($z \to 0$), from
Eq. 8.34 we have the coefficients

$$\frac{2z - z}{z^2} \quad \text{and} \quad \frac{1 - n^2z^2}{z^4}.$$

Since the latter expression diverges as $z^{-4}$, point $x = \infty$ is an irregular or essential
singularity.
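As a second, shorter illustration of the same test at infinity (a worked sketch added here, not taken from the text), consider the simple harmonic oscillator equation $y'' + \omega^2y = 0$, for which $P(x) = 0$ and $Q(x) = \omega^2$. The two coefficients of Eq. 8.34 become:

```latex
% Simple harmonic oscillator: P(x) = 0, Q(x) = \omega^2.
\frac{2z - P(z^{-1})}{z^2} = \frac{2}{z},
\qquad
\frac{Q(z^{-1})}{z^4} = \frac{\omega^2}{z^4}.
% The first diverges no faster than 1/z, but the second diverges
% faster than 1/z^2, so x = \infty is an irregular singular point.
```

This agrees with the entry for the simple harmonic oscillator in Table 8.3: no singular points in the finite range and an irregular singularity at infinity.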

The ordinary differential equations of Section 8.3, plus two others, the hyper- 
geometric and the confluent hypergeometric, have singular points, as shown in 
Table 8.3. 

It will be seen that the first three equations in the preceding tabulation, hyper- 
geometric, Legendre, and Chebyshev, all have three regular singular points. The 
hypergeometric equation with regular singularities at 0, 1, and oo is taken as the 
standard, the canonical form. The solutions of the other two may then be 
expressed in terms of its solutions, the hypergeometric functions. This is done 
in Chapter 13. 

In a similar manner, the confluent hypergeometric equation is taken as the 
canonical form of a linear second-order differential equation with one regular 
and one irregular singular point. 


EXERCISES 

8.4.1 Show that Legendre’s equation has regular singularities at $x = -1$, 1, and $\infty$.

8.4.2 Show that Laguerre’s equation, like the Bessel equation, has a regular singularity
at $x = 0$ and an irregular singularity at $x = \infty$.
TABLE 8.3

    Equation                                                       Regular          Irregular
                                                                   singularity      singularity
                                                                   x =              x =

1.  Hypergeometric
    $x(x - 1)y'' + [(1 + a + b)x - c]y' + aby = 0$.                0, 1, ∞          none

2.  Legendre*
    $(1 - x^2)y'' - 2xy' + l(l + 1)y = 0$.                         -1, 1, ∞         none

3.  Chebyshev
    $(1 - x^2)y'' - xy' + n^2y = 0$.                               -1, 1, ∞         none

4.  Confluent hypergeometric
    $xy'' + (c - x)y' - ay = 0$.                                   0                ∞

5.  Bessel
    $x^2y'' + xy' + (x^2 - n^2)y = 0$.                             0                ∞

6.  Laguerre*
    $xy'' + (1 - x)y' + \alpha y = 0$.                             0                ∞

7.  Simple harmonic oscillator
    $y'' + \omega^2y = 0$.                                         none             ∞

8.  Hermite
    $y'' - 2xy' + 2\alpha y = 0$.                                  none             ∞

* The associated equations have the same singular points.


8.4.3 Show that the substitution

$$x \to \frac{1 - x}{2}, \qquad a = -l, \qquad b = l + 1, \qquad c = 1$$

converts the hypergeometric equation into Legendre’s equation.


8.5 SERIES SOLUTIONS — FROBENIUS' METHOD 

In this section we develop a method of obtaining one solution of the linear, 
second-order, homogeneous differential equation. The method, a series expan- 
sion, will always work, provided the point of expansion is no worse than a 
regular singular point. In physics this very gentle condition is almost always 
satisfied. 

A linear, second-order, homogeneous differential equation may be put in the
form

$$\frac{d^2y}{dx^2} + P(x)\frac{dy}{dx} + Q(x)y = 0. \tag{8.36}$$

The equation is homogeneous because each term contains $y(x)$ or a derivative;
linear because each $y$, $dy/dx$, or $d^2y/dx^2$ appears as the first power, and no
products. In this section we develop (at least) one solution of Eq. 8.36. In Section
8.6 we develop a second, independent solution and prove that no third, independent
solution exists. Therefore the most general solution of Eq. 8.36 may
be written as


$$y(x) = c_1y_1(x) + c_2y_2(x). \tag{8.37}$$

Our physical problem may lead to a nonhomogeneous, linear, second-order
differential equation

$$\frac{d^2y}{dx^2} + P(x)\frac{dy}{dx} + Q(x)y = F(x). \tag{8.38}$$

The function on the right, $F(x)$, represents a source (such as electrostatic charge)
or a driving force (as in a driven oscillator). Specific solutions of this nonhomogeneous
equation are touched on in Exercise 8.6.25. They are explored in some
detail, using Green’s function techniques, in Sections 8.7, 16.5, and 16.6, and
with a Laplace transform technique in Section 15.11. Calling this solution $y_p$,
we may add to it any solution of the corresponding homogeneous equation
(Eq. 8.36). Hence the most general solution of Eq. 8.38 is

$$y(x) = c_1y_1(x) + c_2y_2(x) + y_p(x). \tag{8.39}$$

The constants $c_1$ and $c_2$ will eventually be fixed by boundary conditions.

For the present, we assume that $F(x) = 0$, that our differential equation is
homogeneous. We shall attempt to develop a solution of our linear, second-order,
homogeneous differential equation, Eq. 8.36, by substituting in a power
series with undetermined coefficients. Also available as a parameter is the power
of the lowest nonvanishing term of the series. To illustrate, we apply the method
to two important differential equations. First, the linear oscillator equation

$$\frac{d^2y}{dx^2} + \omega^2y = 0, \tag{8.40}$$

with known solutions $y = \sin\omega x$, $\cos\omega x$.
We try

$$y(x) = x^k(a_0 + a_1x + a_2x^2 + a_3x^3 + \cdots) = \sum_{\lambda=0}^{\infty} a_\lambda x^{k+\lambda}, \qquad a_0 \neq 0, \tag{8.41}$$

with the exponent $k$ and all the coefficients $a_\lambda$ still undetermined. Note that $k$
need not be an integer. By differentiating twice, we obtain

$$\frac{dy}{dx} = \sum_{\lambda=0}^{\infty} a_\lambda(k + \lambda)x^{k+\lambda-1},$$

$$\frac{d^2y}{dx^2} = \sum_{\lambda=0}^{\infty} a_\lambda(k + \lambda)(k + \lambda - 1)x^{k+\lambda-2}.$$

By substituting into Eq. 8.40, we have

$$\sum_{\lambda=0}^{\infty} a_\lambda(k + \lambda)(k + \lambda - 1)x^{k+\lambda-2} + \omega^2\sum_{\lambda=0}^{\infty} a_\lambda x^{k+\lambda} = 0. \tag{8.42}$$
From our analysis of the uniqueness of power series (Chapter 5) the coefficients
of each power of $x$ on the left-hand side of Eq. 8.42 must vanish individually.

The lowest power of $x$ appearing in Eq. 8.42 is $x^{k-2}$, for $\lambda = 0$ in the first
summation. The requirement that the coefficient vanish¹ yields

$$a_0k(k - 1) = 0.$$

We had chosen $a_0$ as the coefficient of the lowest nonvanishing term of the series
(Eq. 8.41), hence, by definition, $a_0 \neq 0$. Therefore we have

$$k(k - 1) = 0. \tag{8.43}$$

This equation, coming from the coefficient of the lowest power of $x$, we call the
indicial equation. The indicial equation and its roots are of critical importance to
our analysis. Clearly, in this example we must require either that $k = 0$ or $k = 1$.

Before considering these two possibilities for $k$, we return to Eq. 8.42 and
demand that the remaining net coefficients, say, the coefficient of $x^{k+j}$ ($j \geq 0$),
vanish. We set $\lambda = j + 2$ in the first summation and $\lambda = j$ in the second. (They
are independent summations and $\lambda$ is a dummy index.) This results in

$$a_{j+2}(k + j + 2)(k + j + 1) + \omega^2a_j = 0$$

or

$$a_{j+2} = -a_j\frac{\omega^2}{(k + j + 2)(k + j + 1)}. \tag{8.44}$$

This is a two-term recurrence relation.² Given $a_j$, we may compute $a_{j+2}$ and then
$a_{j+4}$, $a_{j+6}$, and so on up as far as desired. The reader will note that for this
example, if we start with $a_0$, Eq. 8.44 leads to the even coefficients $a_2$, $a_4$, and so
on, and ignores $a_1$, $a_3$, $a_5$, and so on. Since $a_1$ is arbitrary, let us set it equal to
zero (compare Exercises 8.5.3 and 8.5.4) and then by Eq. 8.44

$$a_3 = a_5 = a_7 = \cdots = 0,$$

and all the odd numbered coefficients vanish. Do not worry about the lost
terms; the object here is to get a solution. The rejected powers of $x$ will actually
reappear when the second root of the indicial equation is used.

¹ Uniqueness of power series, Section 5.7.

² The recurrence relation may involve three terms; that is, $a_{j+2}$, depending
on $a_j$ and $a_{j-2}$. Equation 13.12 for the Hermite functions provides an example
of this behavior.

Returning to Eq. 8.43, our indicial equation, we first try the solution $k = 0$.
The recurrence relation (Eq. 8.44) becomes

$$a_{j+2} = -a_j\frac{\omega^2}{(j + 2)(j + 1)}, \tag{8.45}$$

which leads to

$$a_2 = -a_0\frac{\omega^2}{1\cdot 2} = -\frac{\omega^2}{2!}a_0,$$

$$a_4 = -a_2\frac{\omega^2}{3\cdot 4} = +\frac{\omega^4}{4!}a_0,$$

$$a_6 = -a_4\frac{\omega^2}{5\cdot 6} = -\frac{\omega^6}{6!}a_0, \quad \text{and so on.}$$

By inspection (and mathematical induction)

$$a_{2n} = (-1)^n\frac{\omega^{2n}}{(2n)!}a_0,$$

and our solution is

$$y(x)_{k=0} = a_0\left[1 - \frac{(\omega x)^2}{2!} + \frac{(\omega x)^4}{4!} - \frac{(\omega x)^6}{6!} + \cdots\right] = a_0\cos\omega x.$$

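If we choose the indicial equation root $k = 1$ (Eq. 8.44), the recurrence
relation becomes

$$a_{j+2} = -a_j\frac{\omega^2}{(j + 3)(j + 2)}.$$

Substituting in $j = 0, 2, 4$, successively, we obtain

$$a_2 = -a_0\frac{\omega^2}{2\cdot 3} = -\frac{\omega^2}{3!}a_0,$$

$$a_4 = -a_2\frac{\omega^2}{4\cdot 5} = +\frac{\omega^4}{5!}a_0,$$

$$a_6 = -a_4\frac{\omega^2}{6\cdot 7} = -\frac{\omega^6}{7!}a_0, \quad \text{and so on.}$$

Again, by inspection and mathematical induction,

$$a_{2n} = (-1)^n\frac{\omega^{2n}}{(2n + 1)!}a_0.$$

For this choice, $k = 1$, we obtain

$$y(x)_{k=1} = a_0x\left[1 - \frac{(\omega x)^2}{3!} + \frac{(\omega x)^4}{5!} - \frac{(\omega x)^6}{7!} + \cdots\right]
= \frac{a_0}{\omega}\left[(\omega x) - \frac{(\omega x)^3}{3!} + \frac{(\omega x)^5}{5!} - \frac{(\omega x)^7}{7!} + \cdots\right]
= \frac{a_0}{\omega}\sin\omega x.$$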
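The recurrence relation of Eq. 8.44 is easy to iterate numerically. The following is a minimal sketch (the function name `frobenius_sho` and the choice of 30 terms are assumptions made for illustration): it sums the series for either root of the indicial equation, with $a_1 = 0$, so the result can be compared with $a_0\cos\omega x$ for $k = 0$ and $(a_0/\omega)\sin\omega x$ for $k = 1$.

```python
import math

def frobenius_sho(k, omega, x, terms=30, a0=1.0):
    """Sum y = sum_j a_j x^(k+j), j even, with a_(j+2) taken from Eq. 8.44."""
    total = 0.0
    a_j = a0
    for j in range(0, 2 * terms, 2):
        total += a_j * x ** (k + j)
        # Eq. 8.44: a_(j+2) = -a_j * omega^2 / ((k+j+2)(k+j+1))
        a_j = -a_j * omega ** 2 / ((k + j + 2) * (k + j + 1))
    return total

omega, x = 2.0, 0.7
print(frobenius_sho(0, omega, x), math.cos(omega * x))
print(frobenius_sho(1, omega, x), math.sin(omega * x) / omega)
```

The two printed pairs should agree to within rounding error, since the series converges rapidly for moderate $\omega x$.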
[FIG. 8.3: schematic form of Eq. 8.42, with the coefficients of successive powers of $x$ labeled I, II, III, IV.]

To summarize this approach, we may write Eq. 8.42 schematically as shown
in Fig. 8.3. From the uniqueness of power series (Section 5.7), the total coefficient
of each power of $x$ must vanish, all by itself. The requirement that the first
coefficient (I) vanish leads to the indicial equation, Eq. 8.43. The second coefficient
is handled by setting $a_1 = 0$. The vanishing of the coefficient of $x^k$ (and
higher powers, taken one at a time) leads to the recurrence relation Eq. 8.44.

This series substitution, known as Frobenius’ method, has given us two 
series solutions of the linear oscillator equation. However, there are two points 
about such series solutions that must be strongly emphasized : 

1. The series solution should always be substituted back 
into the differential equation, to see if it works, as a 
precaution against algebraic and logical errors. 

Conversely, if it works, it is a solution. 

2. The acceptability of a series solution depends on its 
convergence (including asymptotic convergence). It 
is quite possible for Frobenius’ method to give a 
series solution that satisfies the original differential 
equation when substituted in the equation but that 
does not converge over the region of interest. 

Legendre’s differential equation illustrates this 
situation. 
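The first precaution, substituting the series back into the differential equation, can be automated for the linear oscillator. The sketch below is an illustration under stated assumptions (the helper names `cosine_coeffs` and `sho_residual` are hypothetical, and exact rational arithmetic is used so that "vanishes" means exactly zero). It computes the net coefficient of each power $x^{k+\lambda-2}$ in $y'' + \omega^2y$ for a truncated series.

```python
from fractions import Fraction
from math import factorial

def cosine_coeffs(omega2, n_terms):
    """k = 0 series: a_(2n) = (-1)^n omega^(2n) / (2n)!, odd coefficients zero."""
    coeffs = []
    for lam in range(n_terms):
        if lam % 2:
            coeffs.append(Fraction(0))
        else:
            n = lam // 2
            coeffs.append(Fraction((-1) ** n) * omega2 ** n / factorial(2 * n))
    return coeffs

def sho_residual(coeffs, k, omega2):
    """Net coefficient of x^(k+lam-2) in y'' + omega^2 y, for each lam."""
    residual = []
    for lam in range(len(coeffs)):
        term = coeffs[lam] * (k + lam) * (k + lam - 1)  # from y''
        if lam >= 2:
            term += omega2 * coeffs[lam - 2]            # from omega^2 y
        residual.append(term)
    return residual

omega2 = Fraction(9, 4)  # omega^2 for omega = 3/2
print(all(r == 0 for r in sho_residual(cosine_coeffs(omega2, 10), 0, omega2)))
```

Only the powers fully covered by the truncated series are checked; the last two powers contributed by $\omega^2y$ lie beyond the truncation and are not included. A sign or factorial slip in any coefficient shows up as a nonzero entry in the residual list.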

Expansion about $x_0$

Equation 8.41 is an expansion about the origin, $x_0 = 0$. It is perfectly possible
to replace Eq. 8.41 with

$$y(x) = \sum_{\lambda=0}^{\infty} a_\lambda(x - x_0)^{k+\lambda}, \qquad a_0 \neq 0. \tag{8.51}$$

Indeed, for the Legendre, Chebyshev, and hypergeometric equations the choice
$x_0 = 1$ has some advantages. The point $x_0$ should not be chosen at an essential
singularity, or our Frobenius method will probably fail. The resultant series
($x_0$ an ordinary point or regular singular point) will be valid where it converges.
You can expect a divergence of some sort when $|x - x_0| = |z_s - x_0|$, where $z_s$
is the closest singularity to $x_0$ in the complex plane.

Symmetry of Solutions 

The alert reader will note that we obtained one solution of even symmetry,
$y_1(x) = y_1(-x)$, and one of odd symmetry, $y_2(x) = -y_2(-x)$. This is not just
an accident but a direct consequence of the form of the differential equation.
Writing a general differential equation as

$$\mathcal{L}(x)y(x) = 0, \tag{8.52}$$

in which $\mathcal{L}(x)$ is the differential operator, we see that for the linear oscillator
equation (Eq. 8.40) $\mathcal{L}(x)$ is even; that is,

$$\mathcal{L}(x) = \mathcal{L}(-x). \tag{8.53}$$

Often this is described as even parity.

Whenever the differential operator has a specific parity or symmetry, either
even or odd, we may interchange $+x$ and $-x$, and Eq. 8.52 becomes

$$\pm\mathcal{L}(x)y(-x) = 0, \tag{8.54}$$

$+$ if $\mathcal{L}(x)$ is even, $-$ if $\mathcal{L}(x)$ is odd. Clearly, if $y(x)$ is a solution of the differential
equation, $y(-x)$ is also a solution. Then any solution may be resolved into even
and odd parts,

$$y(x) = \tfrac{1}{2}[y(x) + y(-x)] + \tfrac{1}{2}[y(x) - y(-x)], \tag{8.55}$$

the first bracket on the right giving an even solution, the second an odd solution. 
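The decomposition of Eq. 8.55 can be written out directly. The sketch below is illustrative only (the helper names `even_part` and `odd_part` are assumptions). It uses $y = e^x$, which solves $y'' - y = 0$, an equation whose operator has even parity; the even and odd parts should then be the familiar solutions $\cosh x$ and $\sinh x$.

```python
import math

def even_part(y):
    """First bracket of Eq. 8.55: [y(x) + y(-x)] / 2."""
    return lambda x: 0.5 * (y(x) + y(-x))

def odd_part(y):
    """Second bracket of Eq. 8.55: [y(x) - y(-x)] / 2."""
    return lambda x: 0.5 * (y(x) - y(-x))

y_even = even_part(math.exp)
y_odd = odd_part(math.exp)
print(y_even(0.8), math.cosh(0.8))
print(y_odd(0.8), math.sinh(0.8))
```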

If we refer back to Section 8.4, we can see that Legendre, Chebyshev, Bessel, 
simple harmonic oscillator, and Hermite equations (or differential operators) 
all exhibit this even parity. Solutions of all of them may be presented as series 
of even powers of x and separate series of odd powers of x. The Laguerre 
differential operator has neither even nor odd symmetry; hence its solutions 
cannot be expected to exhibit even or odd parity. Our emphasis on parity stems 
primarily from the importance of parity in quantum mechanics. We find that 
wave functions usually are either even or odd, meaning that they have a definite 
parity. Most interactions (beta decay is the big exception) are also even or odd 
and the result is that parity is conserved. 

Limitations of Series Approach — Bessel's 
Equation 

This attack on the linear oscillator equation was perhaps a bit too easy. 
By substituting the power series (Eq. 8.41) into the differential equation (Eq. 
8.40), we obtained two independent solutions with no trouble at all. 

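To get some idea of what can happen we try to solve Bessel’s equation,

$$x^2y'' + xy' + (x^2 - n^2)y = 0, \tag{8.56}$$

using $y'$ for $dy/dx$ and $y''$ for $d^2y/dx^2$. Again, assuming a solution of the form

$$y(x) = \sum_{\lambda=0}^{\infty} a_\lambda x^{k+\lambda},$$

we differentiate and substitute into Eq. 8.56. The result is

$$\sum_{\lambda=0}^{\infty} a_\lambda(k + \lambda)(k + \lambda - 1)x^{k+\lambda} + \sum_{\lambda=0}^{\infty} a_\lambda(k + \lambda)x^{k+\lambda} + \sum_{\lambda=0}^{\infty} a_\lambda x^{k+\lambda+2} - \sum_{\lambda=0}^{\infty} a_\lambda n^2x^{k+\lambda} = 0. \tag{8.57}$$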
By setting $\lambda = 0$, we get the coefficient of $x^k$, the lowest power of $x$ appearing
on the left-hand side:

$$a_0[k(k - 1) + k - n^2] = 0, \tag{8.58}$$

and again $a_0 \neq 0$ by definition. Equation 8.58 therefore yields the indicial
equation

$$k^2 - n^2 = 0 \tag{8.59}$$

with solutions $k = \pm n$.

It is of some interest to examine the coefficient of $x^{k+1}$ also. Here we obtain

$$a_1[(k + 1)k + k + 1 - n^2] = 0$$

or

$$a_1(k + 1 - n)(k + 1 + n) = 0. \tag{8.60}$$

For $k = \pm n$ neither $k + 1 - n$ nor $k + 1 + n$ vanishes and we must require
$a_1 = 0$.³

Proceeding to the coefficient of $x^{k+j}$ for $k = n$, we set $\lambda = j$ in the first, second,
and fourth terms of Eq. 8.57 and $\lambda = j - 2$ in the third term. By requiring the
resultant coefficient of $x^{k+j}$ to vanish, we obtain

$$a_j[(n + j)(n + j - 1) + (n + j) - n^2] + a_{j-2} = 0.$$

When $j$ is replaced by $j + 2$, this can be rewritten as

$$a_{j+2} = -a_j\frac{1}{(j + 2)(2n + j + 2)}, \tag{8.61}$$


which is the desired recurrence relation. Repeated application of this recurrence
relation leads to

$$a_2 = -a_0\frac{1}{2(2n + 2)} = -\frac{a_0\,n!}{2^2\,1!\,(n + 1)!},$$

$$a_4 = -a_2\frac{1}{4(2n + 4)} = +\frac{a_0\,n!}{2^4\,2!\,(n + 2)!},$$

$$a_6 = -a_4\frac{1}{6(2n + 6)} = -\frac{a_0\,n!}{2^6\,3!\,(n + 3)!}, \quad \text{and so on,}$$

and in general

$$a_{2p} = (-1)^p\frac{a_0\,n!}{2^{2p}\,p!\,(n + p)!}. \tag{8.62}$$

Inserting these coefficients in our assumed series solution, we have

$$y(x) = a_0x^n\left[1 - \frac{n!\,x^2}{2^2\,1!\,(n + 1)!} + \frac{n!\,x^4}{2^4\,2!\,(n + 2)!} - \cdots\right]. \tag{8.63}$$

In summation form

$$y(x) = a_0\sum_{j=0}^{\infty}(-1)^j\frac{n!\,x^{n+2j}}{2^{2j}\,j!\,(n + j)!}
= a_0\,2^n\,n!\sum_{j=0}^{\infty}(-1)^j\frac{1}{j!\,(n + j)!}\left(\frac{x}{2}\right)^{n+2j}. \tag{8.64}$$

³ $k = \pm n = -\tfrac{1}{2}$ are exceptions.

In Chapter 11 the final summation is identified as the Bessel function $J_n(x)$.
Notice that this solution $J_n(x)$ has either even or odd symmetry⁴ as might be
expected from the form of Bessel’s equation.
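As a check on the algebra leading from the recurrence relation, Eq. 8.61, to the closed form, Eq. 8.62, the two can be compared term by term in exact arithmetic. This is a minimal sketch for non-negative integer $n$; the function names are assumptions introduced for illustration.

```python
from fractions import Fraction
from math import factorial

def bessel_coeffs_recurrence(n, count, a0=Fraction(1)):
    """Even coefficients a_0, a_2, a_4, ... from Eq. 8.61 with k = n."""
    coeffs = [a0]
    j = 0
    while len(coeffs) < count:
        # Eq. 8.61: a_(j+2) = -a_j / ((j+2)(2n+j+2))
        coeffs.append(-coeffs[-1] / ((j + 2) * (2 * n + j + 2)))
        j += 2
    return coeffs

def bessel_coeff_closed(n, p, a0=Fraction(1)):
    """Eq. 8.62: a_(2p) = (-1)^p a0 n! / (2^(2p) p! (n+p)!)."""
    return a0 * (-1) ** p * Fraction(
        factorial(n), 2 ** (2 * p) * factorial(p) * factorial(n + p))

n = 3
recurrence = bessel_coeffs_recurrence(n, 6)
closed = [bessel_coeff_closed(n, p) for p in range(6)]
print(recurrence == closed)
```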

When $k = -n$ and $n$ is not an integer, we may generate a second distinct
series to be labeled $J_{-n}(x)$. However, when $-n$ is a negative integer, trouble
develops. The recurrence relation for the coefficients $a_j$ is still given by Eq. 8.61,
but with $2n$ replaced by $-2n$. Then, when $j + 2 = 2n$ or $j = 2(n - 1)$, the
coefficient $a_{j+2}$ blows up and we have no series solution. This catastrophe can
be remedied in Eq. 8.64, as it is done in Chapter 11, with the result that

$$J_{-n}(x) = (-1)^nJ_n(x), \qquad n \text{ an integer.} \tag{8.65}$$

The second solution simply reproduces the first. We have failed to construct a
second independent solution for Bessel’s equation by this series technique when
$n$ is an integer.
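The breakdown for integral $n$ can be located by running the recurrence with $2n$ replaced by $-2n$ and watching for a vanishing denominator. The sketch below is illustrative (the function name `coeffs_for_k_minus_n` and the cutoff `max_j` are assumptions); it returns the coefficients found so far together with the value of $j$ at which $a_{j+2}$ would blow up, or `None` if no breakdown occurs within the cutoff.

```python
from fractions import Fraction

def coeffs_for_k_minus_n(n, max_j=20):
    """Run Eq. 8.61 with 2n -> -2n (root k = -n), starting from a_0 = 1."""
    coeffs = [Fraction(1)]
    for j in range(0, max_j, 2):
        denom = (j + 2) * (-2 * n + j + 2)
        if denom == 0:
            return coeffs, j          # a_(j+2) would be infinite here
        coeffs.append(-coeffs[-1] / denom)
    return coeffs, None

print(coeffs_for_k_minus_n(2)[1])               # integer n: fails at j = 2(n - 1)
print(coeffs_for_k_minus_n(Fraction(1, 3))[1])  # nonintegral n: no breakdown
```

For $n = 2$ the failure occurs at $j = 2$, in agreement with $j = 2(n - 1)$; for $n = 1/3$ every denominator is nonzero and the second series exists.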

By substituting in an infinite series, we have obtained two solutions for the 
linear oscillator equation and one for Bessel’s equation (two if n is not an integer). 
To the questions “Can we always do this? Will this method always work?” 
the answer is no, we cannot always do this. This method of series solution will 
not always work. 


Regular and Irregular Singularities 

The success of the series substitution method depends on the roots of the 
indicial equation and the degree of singularity of the coefficients in the differ- 
ential equation. To understand better the effect of the equation coefficients on 
this naive series substitution approach, consider four simple equations: 


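$$y'' - \frac{6}{x^2}y = 0, \tag{8.66a}$$

$$y'' - \frac{6}{x^3}y = 0, \tag{8.66b}$$

$$y'' + \frac{1}{x}y' - \frac{a^2}{x^2}y = 0, \tag{8.66c}$$

$$y'' + \frac{1}{x^2}y' - \frac{a^2}{x^2}y = 0. \tag{8.66d}$$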

⁴ $J_n(x)$ is an even function if $n$ is an even integer, an odd function if $n$ is an odd
integer. For nonintegral $n$ the $x^n$ has no such simple symmetry.

The reader may show easily that for Eq. 8.66a the indicial equation is

$$k^2 - k - 6 = 0,$$

giving $k = 3, -2$. Since the equation is homogeneous in $x$ (counting $d^2/dx^2$ as
$x^{-2}$), there is no recurrence relation; $a_i = 0$ for $i > 0$. However, we are left with
two perfectly good solutions, $x^3$ and $x^{-2}$.

Equation 8.66b differs from Eq. 8.66a by only one power of $x$, but this sends
the indicial equation to

$$-6a_0 = 0,$$

with no solution at all, for we have agreed that $a_0 \neq 0$. Our series substitution
worked for Eq. 8.66a, which had only a regular singularity, but broke down at
Eq. 8.66b, which has an irregular singular point at the origin.

Continuing with Eq. 8.66c, we have added a term $y'/x$. The indicial equation
is

$$k^2 - a^2 = 0,$$

but again, there is no recurrence relation. The solutions are $y = x^a$, $x^{-a}$, both
perfectly acceptable one term series.

When we change the power of $x$ in the coefficient of $y'$ from $-1$ to $-2$,
Eq. 8.66d, there is a drastic change in the solution. The indicial equation (with
only the $y'$ term contributing) becomes

$$k = 0.$$

There is a recurrence relation

$$a_{j+1} = +a_j\frac{a^2 - j(j - 1)}{j + 1}.$$

Unless the parameter $a$ is selected to make the series terminate, we have

$$\lim_{j\to\infty}\left|\frac{a_{j+1}}{a_j}\right| = \lim_{j\to\infty}\frac{j(j - 1)}{j + 1} = \lim_{j\to\infty}\frac{j^2}{j} = \infty.$$

Hence our series solution diverges for all $x \neq 0$. Again, our method worked
for Eq. 8.66c with a regular singularity but failed when we had the irregular
singularity of Eq. 8.66d.
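The behavior of the recurrence relation for Eq. 8.66d can be seen by generating a few coefficients. This is a sketch under stated assumptions (the function name `coeffs_866d` is hypothetical; $a_0 = 1$ and $k = 0$ are taken from the discussion above). It shows both cases: a ratio $|a_{j+1}/a_j|$ that grows without bound, and termination when $a^2 = j(j - 1)$ for some integer $j$.

```python
from fractions import Fraction

def coeffs_866d(a_squared, count):
    """a_(j+1) = a_j [a^2 - j(j-1)] / (j+1), with k = 0 and a_0 = 1."""
    coeffs = [Fraction(1)]
    for j in range(count - 1):
        coeffs.append(coeffs[-1] * (a_squared - j * (j - 1)) / (j + 1))
    return coeffs

# a^2 = 1: no termination, successive ratios keep growing.
c = coeffs_866d(1, 25)
print(float(abs(c[11] / c[10])), float(abs(c[21] / c[20])))

# a^2 = 6 = 3 * 2: the series terminates after the x^3 term.
print(coeffs_866d(6, 6))
```

With $a^2 = 6$ the coefficients are 1, 6, 18, 24, 0, 0, so the series reduces to the polynomial $1 + 6x + 18x^2 + 24x^3$, which can be substituted directly into Eq. 8.66d as a check.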


Fuchs's Theorem 

The answer to the basic question of when the method of series substitution
can be expected to work is given by Fuchs’s theorem, which asserts that we can
always obtain at least one power-series solution, provided we are expanding
about a point that is an ordinary point or at worst a regular singular point.

If we attempt an expansion about an irregular or essential singularity, our
method may fail as it did for Eqs. 8.66b and 8.66d. Fortunately, the more
important equations of mathematical physics listed in Section 8.4 have no
irregular singularities in the finite plane. Further discussion of Fuchs’s theorem
appears in Section 8.6.

From Table 8.3, Section 8.4, infinity is seen to be a singular point for all
equations considered. As a further illustration of Fuchs’s theorem, Legendre’s
equation (with infinity as a regular singularity) has a convergent series solution
in negative powers of the argument (Section 12.10). In contrast, Bessel’s equation
(with an irregular singularity at infinity) yields asymptotic series (Sections 5.10
and 11.6). Although extremely useful, these asymptotic solutions are technically
divergent.


Summary 

If we are expanding about an ordinary point or at worst about a regular 
singularity, the series substitution approach will yield at least one solution 
(Fuchs’s theorem). 

Whether we get one or two distinct solutions depends on the roots of the 
indicial equation. 

1. If the two roots of the indicial equation are equal, 
we can obtain only one solution by this series sub- 
stitution method. 

2. If the two roots differ by a nonintegral number, two 
independent solutions may be obtained. 

3. If the two roots differ by an integer, the larger of the 
two will yield a solution. 


The smaller may or may not give a solution, depending on the behavior of the 
coefficients. In the linear oscillator equation we obtain two solutions; for Bessel’s 
equation, only one solution. 

The usefulness of the series solution for obtaining the solution as actual 
numbers depends on the rapidity of convergence of the series and the availability 
of the coefficients. Many, probably most, differential equations will not 
yield nice simple recurrence relations for the coefficients. In general, the available 
series will probably be useful for $|x|$ (or $|x - x_0|$) very small. Computers can 
be used to determine additional series coefficients using a language such as 
FORMAC. Often, however, for numerical work a direct numerical integration 
will be preferred (Section 8.8). 
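
As a minimal sketch of how series coefficients might be generated by machine, the following Python fragment iterates a two-term recurrence relation in exact rational arithmetic. It assumes the Legendre recurrence quoted in Exercise 8.5.5, $a_{j+2} = [j(j+1) - n(n+1)]a_j/[(j+1)(j+2)]$ with $a_0 = 1$; the function names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def legendre_even_coefficients(n, terms):
    """Return [a_0, a_2, a_4, ...] for the even Legendre series with a_0 = 1.

    Assumes a_{j+2} = [j(j+1) - n(n+1)] / [(j+1)(j+2)] * a_j (Exercise 8.5.5).
    """
    n = Fraction(n)
    coeffs = [Fraction(1)]
    j = 0
    while len(coeffs) < terms:
        ratio = (j * (j + 1) - n * (n + 1)) / ((j + 1) * (j + 2))
        coeffs.append(ratio * coeffs[-1])
        j += 2
    return coeffs

def partial_sum(coeffs, x):
    """Evaluate sum_k coeffs[k] * x**(2k) for the even series."""
    return sum(c * x ** (2 * k) for k, c in enumerate(coeffs))

# For n = 2 the recurrence terminates after the x^2 term,
# leaving 1 - 3x^2, which is proportional to P_2(x).
print(legendre_even_coefficients(2, 4))
```

For nonintegral $n$ the coefficients never vanish, and the same routine can be used to watch how slowly the terms decrease as $|x|$ approaches 1.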


EXERCISES 

8.5.1 Uniqueness theorem. The function $y(x)$ satisfies a second-order, linear, homogeneous 
differential equation. At $x = x_0$, $y(x) = y_0$ and $dy/dx = y'_0$. Show that 
$y(x)$ is unique in that no other solution of this differential equation passes through 
the point $(x_0, y_0)$ with a slope of $y'_0$. 

Hint. Assume a second solution satisfying these conditions and compare the 
Taylor series expansions. 

8.5.2 A series solution of Eq. 8.36 is attempted, expanding about the point $x = x_0$. 
If $x_0$ is an ordinary point, show that the indicial equation has roots $k = 0, 1$. 

8.5.3 In the development of a series solution of the simple harmonic oscillator equation 
the second series coefficient $a_1$ was neglected except to set it equal to zero. From 
the coefficient of the next to the lowest power of $x$, $x^{k-1}$, develop a second indicial 
type equation. 

(a) (SHO equation with $k = 0$). Show that $a_1$ may be assigned any finite value 
(including zero). 

(b) (SHO equation with $k = 1$). Show that $a_1$ must be set equal to zero. 

8.5.4 Analyze the series solutions of the following differential equations to see when 
$a_1$ may be set equal to zero without irrevocably losing anything and when $a_1$ must 
be set equal to zero. 

(a) Legendre, (b) Chebyshev, (c) Bessel, (d) Hermite. 

ANS. (a) Legendre, (b) Chebyshev, and (d) Hermite: For $k = 0$, $a_1$ may 
be set equal to zero; for $k = 1$, $a_1$ must be set equal to zero. 

(c) Bessel: $a_1$ must be set equal to zero (except for $k = \pm n = -\tfrac{1}{2}$). 

8.5.5 Solve the Legendre equation 

$$ (1 - x^2)y'' - 2xy' + n(n+1)y = 0 $$

by direct series substitution. 

(a) Verify that the indicial equation is 

$$ k(k - 1) = 0. $$

(b) Using $k = 0$, obtain a series of even powers of $x$ ($a_1 = 0$), 

$$ y_{\rm even} = a_0\left[1 - \frac{n(n+1)}{2!}x^2 + \frac{n(n-2)(n+1)(n+3)}{4!}x^4 + \cdots\right], $$

where 

$$ a_{j+2} = \frac{j(j+1) - n(n+1)}{(j+1)(j+2)}\,a_j. $$

(c) Using $k = 1$, develop a series of odd powers of $x$ ($a_1 = 0$), 

$$ y_{\rm odd} = a_0\left[x - \frac{(n-1)(n+2)}{3!}x^3 + \frac{(n-1)(n-3)(n+2)(n+4)}{5!}x^5 + \cdots\right], $$

where 

$$ a_{j+2} = \frac{(j+1)(j+2) - n(n+1)}{(j+2)(j+3)}\,a_j. $$

(d) Show that both solutions, $y_{\rm even}$ and $y_{\rm odd}$, diverge for $x = \pm 1$ if the series 
continue to infinity. 

(e) Finally, show that by an appropriate choice of $n$, one series at a time may 
be converted into a polynomial, thereby avoiding the divergence catastrophe. 
In quantum mechanics this restriction of $n$ to integral values corresponds 
to quantization of angular momentum. 

8.5.6 (a) Develop series solutions for Hermite’s differential equation 

$$ y'' - 2xy' + 2\alpha y = 0. $$

ANS. $k(k - 1) = 0$, indicial equation. 

For $k = 0$, 

$$ a_{j+2} = 2a_j\,\frac{j - \alpha}{(j+1)(j+2)} \quad (j\ \text{even}), $$

$$ y_{\rm even} = a_0\left[1 + \frac{2(-\alpha)x^2}{2!} + \frac{2^2(-\alpha)(2-\alpha)x^4}{4!} + \cdots\right]. $$

For $k = 1$, 

$$ a_{j+2} = 2a_j\,\frac{j + 1 - \alpha}{(j+2)(j+3)} \quad (j\ \text{even}), $$

$$ y_{\rm odd} = a_0\left[x + \frac{2(1-\alpha)x^3}{3!} + \frac{2^2(1-\alpha)(3-\alpha)x^5}{5!} + \cdots\right]. $$

(b) Show that both series solutions are convergent for all $x$, the ratio of successive 
coefficients behaving, for large index, like the corresponding ratio in the 
expansion of $\exp(x^2)$. 

(c) Show that by appropriate choice of $\alpha$ the series solutions may be cut off and 
converted to finite polynomials. (These polynomials, properly normalized, 
become the Hermite polynomials in Section 13.1.) 


8.5.7 Laguerre’s differential equation is 

$$ xL_n''(x) + (1 - x)L_n'(x) + nL_n(x) = 0. $$

Develop a series solution selecting the parameter $n$ to make your series a polynomial. 

8.5.8 Solve the Chebyshev equation 

$$ (1 - x^2)T_n'' - xT_n' + n^2T_n = 0, $$

by series substitution. What restrictions are imposed on $n$ if you demand that 
the series solution converge for $x = \pm 1$? 

ANS. The infinite series does converge for $x = \pm 1$. Therefore no 
restriction on $n$ exists (compare Exercise 5.2.16). 

8.5.9 Solve 

$$ (1 - x^2)U_n''(x) - 3xU_n'(x) + n(n+2)U_n(x) = 0, $$

choosing the root of the indicial equation to obtain a series of odd powers of $x$. 
Since the series will diverge for $x = 1$, choose $n$ to convert it into a polynomial. 

ANS. $k(k - 1) = 0$. For $k = 1$, 

$$ a_{j+2} = \frac{(j+1)(j+3) - n(n+2)}{(j+2)(j+3)}\,a_j. $$

8.5.10 Obtain a series solution of the hypergeometric equation 

$$ x(x - 1)y'' + [(1 + a + b)x - c]y' + aby = 0. $$

Test your solution for convergence. 

8.5.11 Obtain two series solutions of the confluent hypergeometric equation 

$$ xy'' + (c - x)y' - ay = 0. $$

Test your solutions for convergence. 

8.5.12 A quantum mechanical analysis of the Stark effect (parabolic coordinates) leads 
to the differential equation 

$$ \frac{d}{d\xi}\left(\xi\frac{du}{d\xi}\right) + \left(\frac{1}{2}E\xi + \alpha - \frac{m^2}{4\xi} - \frac{1}{4}F\xi^2\right)u = 0. $$

Here $\alpha$ is a separation constant, $E$ is the total energy, and $F$ is a constant, where 
$Fz$ is the potential energy added to the system by the introduction of an electric 
field. 

Using the larger root of the indicial equation, develop a power series solution 
about $\xi = 0$. Evaluate the first three coefficients in terms of $a_0$. 

ANS. Indicial equation $k^2 - \dfrac{m^2}{4} = 0$, 

$$ u(\xi) = a_0\,\xi^{m/2}\left\{1 - \frac{\alpha}{m+1}\xi + \left[\frac{\alpha^2}{2(m+1)(m+2)} - \frac{E}{4(m+2)}\right]\xi^2 + \cdots\right\}. $$

Note that the perturbation $F$ does not appear until $a_3$ is included. 

8.5.13 For the special case of no azimuthal dependence, the quantum mechanical 
analysis of the hydrogen molecular ion leads to the equation 

$$ \frac{d}{d\eta}\left[(1 - \eta^2)\frac{du}{d\eta}\right] + \alpha u + \beta\eta^2 u = 0. $$

Develop a power-series solution for $u(\eta)$. Evaluate the first three nonvanishing 
coefficients in terms of $a_0$. 

ANS. Indicial equation $k(k - 1) = 0$, 

$$ u_{k=1} = a_0\,\eta\left\{1 + \frac{2 - \alpha}{6}\eta^2 + \left[\frac{(2-\alpha)(12-\alpha)}{120} - \frac{\beta}{20}\right]\eta^4 + \cdots\right\}. $$


8.5.14 To a good approximation, the interaction of two nucleons may be described by 
a meson potential 

$$ V = \frac{Ae^{-ax}}{x}, $$

attractive for $A$ negative. Develop a series solution of the resultant Schrödinger 
wave equation 

$$ \frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + (E - V)\psi = 0, $$

through the first three nonvanishing coefficients. 

ANS. 

$$ \psi_{k=1} = a_0\left\{x + \tfrac{1}{2}A'x^2 + \tfrac{1}{6}\left[\tfrac{1}{2}A'^2 - E' - aA'\right]x^3 + \cdots\right\}, $$

where the prime indicates multiplication by $2m/\hbar^2$. 


8.5.15 Near the nucleus of a complex atom the potential energy of one electron is given 
by 

$$ V = \frac{Ze^2}{r}(1 + b_1 r + b_2 r^2), $$

where the coefficients $b_1$ and $b_2$ arise from screening effects. For the case of zero 
angular momentum show that the first three terms of the solution of the 
Schrödinger equation have the same form as those of Exercise 8.5.14. By appropriate 
translation of coefficients or parameters, write out the first three terms in 
a series expansion of the wave function. 


8.5.16 If the parameter $a^2$ in Eq. 8.66d is equal to 2, Eq. 8.66d becomes 

$$ y'' + \frac{1}{x^2}y' - \frac{2}{x^2}y = 0. $$

From the indicial equation and the recurrence relation derive a solution 
$y = 1 + 2x + 2x^2$. Verify that this is indeed a solution by substituting back into the 
differential equation. 

8.5.17 The modified Bessel function $I_0(x)$ satisfies the differential equation 

$$ x^2\frac{d^2}{dx^2}I_0(x) + x\frac{d}{dx}I_0(x) - x^2 I_0(x) = 0. $$

From Exercise 7.4.4 the leading term in an asymptotic expansion is found to be 

$$ I_0(x) \sim \frac{e^x}{\sqrt{2\pi x}}. $$

Assume a series of the form 

$$ I_0(x) \sim \frac{e^x}{\sqrt{2\pi x}}\left\{1 + b_1 x^{-1} + b_2 x^{-2} + \cdots\right\}. $$

Determine the coefficients $b_1$ and $b_2$. 

ANS. $b_1 = \tfrac{1}{8}$, $b_2 = \tfrac{9}{128}$. 

8.5.18 The even power-series solution of Legendre’s equation is given by Exercise 8.5.5. 
Take $a_0 = 1$ and $n$ not an even integer, say, $n = 0.5$. Calculate the partial sums 
of the series through $x^{200}, x^{400}, x^{600}, \ldots, x^{2000}$ for $x = 0.95(0.01)1.00$. Also, 
write out the individual term corresponding to each of these powers. 

Note. This calculation does not constitute proof of convergence at $x = 0.99$ or 
divergence at $x = 1.00$, but perhaps you can see the difference in the behavior 
of the sequence of partial sums for these two values of $x$. 

8.5.19 (a) The odd power-series solution of Hermite’s equation is given by Exercise 
8.5.6. Take $a_0 = 1$. Evaluate this series for $\alpha = 0$, $x = 1, 2, 3$. Cut off your 
calculation after the last term calculated has dropped below the maximum 
term by a factor of $10^6$ or more. Set an upper bound to the error made in 
ignoring the remaining terms in the infinite series. 

(b) As a check on the calculation of part (a), show that the Hermite series 
$y_{\rm odd}(\alpha = 0)$ corresponds to $\int_0^x \exp(x^2)\,dx$. 

(c) Calculate this integral for $x = 1, 2, 3$. 


8.6 A SECOND SOLUTION 

In Section 8.5 a solution of a second-order homogeneous differential equation 
was developed by substituting in a power series. By Fuchs’s theorem this is 
possible, provided the power series is an expansion about an ordinary point 
or a nonessential singularity.¹ There is no guarantee that this approach will 
yield the two independent solutions we expect from a linear second-order 
differential equation. Indeed, the technique gave only one solution for Bessel’s 
equation ($n$ an integer). In this section we develop two methods of obtaining a 
second independent solution: an integral method and a power series containing 
a logarithmic term. First, however, we consider the question of independence of 
a set of functions. 

¹ This is why the classification of singularities in Section 8.4 is of vital 
importance. 

Linear Independence of Solutions 

Given a set of functions, $\varphi_\lambda$, the criterion for linear dependence is the existence 
of a relation of the form 

$$ \sum_\lambda k_\lambda \varphi_\lambda = 0, \tag{8.67} $$

in which not all the coefficients $k_\lambda$ are zero. On the other hand, if the only 
solution of Eq. 8.67 is $k_\lambda = 0$ for all $\lambda$, the set of functions $\varphi_\lambda$ is said to be linearly 
independent. 

It may be helpful to think of linear dependence of vectors. Consider A, B, and 
C in three-dimensional space with $\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} \neq 0$. Then no relation of the form 

aA + bB + cC = 0 (8.68) 

exists. A, B, and C are linearly independent. On the other hand, any fourth 
vector D may be expressed as a linear combination of A, B, and C (see Section 
4.4). We can always write an equation of the form 

D - aA - bB - cC = 0, (8.69) 

and the four vectors are not linearly independent. The three noncoplanar vectors 
A, B and C span our real three-dimensional space. 

If a set of vectors or functions are mutually orthogonal, then they are auto- 
matically linearly independent. Orthogonality implies linear independence. This 
can easily be demonstrated by taking inner products (scalar or dot product for 
vectors, orthogonality integral of Section 9.2 for functions). 

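Let us assume that the functions $\varphi_\lambda$ are differentiable as needed. Then, 
differentiating Eq. 8.67 repeatedly, we generate a set of equations 

$$ \sum_\lambda k_\lambda \varphi_\lambda' = 0, \tag{8.70} $$

$$ \sum_\lambda k_\lambda \varphi_\lambda'' = 0, \quad \text{and so on.} \tag{8.71} $$

This gives us a set of homogeneous linear equations in which $k_\lambda$ are the unknown 
quantities. By Section 4.1 there is a solution $k_\lambda \neq 0$ only if the determinant of 
the coefficients of the $k_\lambda$'s vanishes. This means 

$$ \begin{vmatrix} \varphi_1 & \varphi_2 & \cdots & \varphi_n \\ \varphi_1' & \varphi_2' & \cdots & \varphi_n' \\ \vdots & \vdots & & \vdots \\ \varphi_1^{(n-1)} & \varphi_2^{(n-1)} & \cdots & \varphi_n^{(n-1)} \end{vmatrix} = 0. \tag{8.72} $$

This determinant is called the Wronskian. 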

1. If the Wronskian is not equal to zero, then Eq. 8.67 
has no solution other than $k_\lambda = 0$. The set of functions 
$\varphi_\lambda$ is therefore independent. 

2. If the Wronskian vanishes at isolated values of the 
argument, this does not necessarily prove linear 
dependence (unless the set of functions has only two 
functions). However, if the Wronskian is zero over 
the entire range of the variable, the functions $\varphi_\lambda$ 
are linearly dependent over this range² (compare 
Exercise 8.5.2 for the simple case of two functions). 


EXAMPLE 8.6.1 Linear Independence 


The solutions of the linear oscillator equation 8.40 are $\varphi_1 = \sin\omega x$, $\varphi_2 = 
\cos\omega x$. The Wronskian becomes 

$$ \begin{vmatrix} \sin\omega x & \cos\omega x \\ \omega\cos\omega x & -\omega\sin\omega x \end{vmatrix} = -\omega \neq 0. $$

These two solutions, $\varphi_1$ and $\varphi_2$, are therefore linearly independent. For just two 
functions this means that one is not a multiple of the other, which is obviously 
true in this case. 

You know that 

$$ \sin\omega x = \pm(1 - \cos^2\omega x)^{1/2}, $$

but this is not a linear relation of the form of Eq. 8.67. 
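
The constancy of this Wronskian is easy to confirm numerically. The sketch below is illustrative only: the helper name `wronskian_2` and the value chosen for `omega` are assumptions, not part of the text. It evaluates $\varphi_1\varphi_2' - \varphi_1'\varphi_2$ at a few sample points.

```python
import math

def wronskian_2(f, df, g, dg, x):
    """Two-function Wronskian W = f*g' - f'*g evaluated at x."""
    return f(x) * dg(x) - df(x) * g(x)

omega = 3.0  # arbitrary nonzero angular frequency (illustrative value)

f = lambda x: math.sin(omega * x)            # phi_1
df = lambda x: omega * math.cos(omega * x)   # phi_1'
g = lambda x: math.cos(omega * x)            # phi_2
dg = lambda x: -omega * math.sin(omega * x)  # phi_2'

values = [wronskian_2(f, df, g, dg, x) for x in (0.0, 0.7, 2.5)]
print(values)  # each entry equals -omega up to rounding
```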

EXAMPLE 8.6.2 Linear Dependence 


For an illustration of linear dependence, consider the solutions of the one-dimensional 
diffusion equation. We have $\varphi_1 = e^x$ and $\varphi_2 = e^{-x}$, and we add 
$\varphi_3 = \cosh x$, also a solution. The Wronskian is 

$$ \begin{vmatrix} e^x & e^{-x} & \cosh x \\ e^x & -e^{-x} & \sinh x \\ e^x & e^{-x} & \cosh x \end{vmatrix} = 0. $$

The determinant vanishes for all $x$ because the first and third rows are identical. 
Hence $e^x$, $e^{-x}$, and $\cosh x$ are linearly dependent, and indeed, we have a relation 
of the form of Eq. 8.67: 

$$ e^x + e^{-x} - 2\cosh x = 0 \quad \text{with } k_\lambda \neq 0. $$
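
A short numerical sketch can contrast this dependent set with an independent one. The code below builds the $3\times 3$ Wronskian for $e^x$, $e^{-x}$, $\cosh x$ and, for comparison, for $\sin x$, $e^x$, $e^{-x}$ (the set of Exercise 8.6.6). The function names are hypothetical, and the derivatives are entered by hand rather than computed.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def wronskian_exp_cosh(x):
    """Wronskian of e^x, e^-x, cosh x (rows: functions, first and second derivatives)."""
    ep, em = math.exp(x), math.exp(-x)
    return det3([
        [ep,  em, math.cosh(x)],
        [ep, -em, math.sinh(x)],
        [ep,  em, math.cosh(x)],
    ])

def wronskian_sin_exp(x):
    """Wronskian of sin x, e^x, e^-x, an independent set used for contrast."""
    ep, em = math.exp(x), math.exp(-x)
    return det3([
        [math.sin(x),  ep,  em],
        [math.cos(x),  ep, -em],
        [-math.sin(x), ep,  em],
    ])

for x in (-1.0, 0.3, 1.0):
    print(x, wronskian_exp_cosh(x), wronskian_sin_exp(x), 4 * math.sin(x))
```

The first Wronskian is zero (to rounding) at every sample point, while the second follows $4\sin x$ and so vanishes only at isolated points.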

A Second Solution 

Returning to our linear, second-order, homogeneous, differential equation 
of the general form 

$$ y'' + P(x)y' + Q(x)y = 0, \tag{8.73} $$

let $y_1$ and $y_2$ be two independent solutions. Then the Wronskian, by definition, 
is 

$$ W = y_1 y_2' - y_1' y_2. \tag{8.74} $$

² Compare page 187 of H. Lass, Elements of Pure and Applied Mathematics. 
New York: McGraw-Hill (1957) for proof of this assertion. It is assumed that 
the functions have continuous derivatives and that at least one of the minors 
of the bottom row of Eq. 8.72 (Laplace expansion) does not vanish in 
the interval under consideration. 

By differentiating the Wronskian, we obtain 

$$ \begin{aligned} W' &= y_1' y_2' + y_1 y_2'' - y_1'' y_2 - y_1' y_2' \\ &= y_1[-P(x)y_2' - Q(x)y_2] - y_2[-P(x)y_1' - Q(x)y_1] \\ &= -P(x)(y_1 y_2' - y_1' y_2). \end{aligned} \tag{8.75} $$

The expression in parentheses is just $W$, the Wronskian, and we have 

$$ W' = -P(x)W. \tag{8.76} $$

If $P(x) = 0$, that is, 

$$ y'' + Q(x)y = 0, \tag{8.77} $$

the Wronskian 

$$ W = y_1 y_2' - y_1' y_2 = \text{constant}. \tag{8.78} $$

Since our original differential equation is homogeneous, we may multiply the 
solutions $y_1$ and $y_2$ by whatever constants we wish and arrange to have the 
Wronskian equal to unity (or $-1$). This case, $P(x) = 0$, appears more frequently 
than might be expected. The reader will recall that $\nabla^2$ in cartesian coordinates 
contains no first derivative. Similarly, the radial dependence of $\nabla^2(r\psi(r))$ in 
spherical polar coordinates lacks a first derivative. Finally, every linear second-order 
differential equation can be transformed into an equation of the form of 
Eq. 8.77 (compare Exercise 8.6.11). 
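
Equation 8.76 can be tested on a concrete pair of solutions. The sketch below assumes $y_1 = x$ and $y_2 = 1/x$, which satisfy $y'' + y'/x - y/x^2 = 0$ (the $m = 1$ case of the equation in Exercise 8.6.15), so that $P(x) = 1/x$. It compares a finite-difference estimate of $W'$ with $-P(x)W$. The function names and the step size are illustrative choices.

```python
def wronskian(x):
    """W = y1*y2' - y1'*y2 for the assumed pair y1 = x, y2 = 1/x."""
    y1, dy1 = x, 1.0
    y2, dy2 = 1.0 / x, -1.0 / x**2
    return y1 * dy2 - dy1 * y2

def p_coeff(x):
    """P(x) = 1/x for the equation y'' + y'/x - y/x**2 = 0."""
    return 1.0 / x

def wronskian_derivative(x, h=1e-5):
    """Central-difference estimate of W'(x)."""
    return (wronskian(x + h) - wronskian(x - h)) / (2 * h)

# Residual W' + P*W at a few sample points; values near zero are expected.
residuals = [wronskian_derivative(x) + p_coeff(x) * wronskian(x)
             for x in (0.5, 1.0, 3.0)]
print(residuals)
```

Here $W = -2/x$, which is not constant because $P(x) \neq 0$; for an equation with $P(x) = 0$ the same check would return a constant Wronskian.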

Let us now assume that we have one solution of Eq. 8.73 by a series substitution 
(or by guessing). We now proceed to develop a second, independent 
solution. Rewriting Eq. 8.76 as 

$$ \frac{dW}{W} = -P\,dx_1, $$

we integrate from $x_1 = a$ to $x_1 = x$ to obtain 

$$ \ln\frac{W(x)}{W(a)} = -\int_a^x P(x_1)\,dx_1, $$

or³ 

$$ W(x) = W(a)\exp\left[-\int_a^x P(x_1)\,dx_1\right]. \tag{8.79} $$

But 

$$ W(x) = y_1 y_2' - y_1' y_2 = y_1^2\,\frac{d}{dx}\left(\frac{y_2}{y_1}\right). \tag{8.80} $$

³ If $P(x_1)$ remains finite, $a \le x_1 \le x$, $W(x) \neq 0$ unless $W(a) = 0$. That is, 
the Wronskian of our two solutions is either identically zero or never zero. 

By combining Eqs. 8.79 and 8.80, we have 

$$ \frac{d}{dx}\left(\frac{y_2}{y_1}\right) = W(a)\,\frac{\exp\left[-\int_a^x P(x_1)\,dx_1\right]}{y_1^2}. \tag{8.81} $$

Finally, by integrating Eq. 8.81 from $x_2 = b$ to $x_2 = x$ we get 

$$ y_2(x) = y_1(x)\,W(a)\int_b^x \frac{\exp\left[-\int_a^{x_2} P(x_1)\,dx_1\right]}{[y_1(x_2)]^2}\,dx_2. \tag{8.82} $$

Here $a$ and $b$ are arbitrary constants and a term $y_1(x)\,y_2(b)/y_1(b)$ has been 
dropped, for it leads to nothing new. Since $W(a)$, the Wronskian evaluated at 
$x = a$, is a constant and our solutions for the homogeneous differential equation 
always contain an unknown normalizing factor, we set $W(a) = 1$ and write 

$$ y_2(x) = y_1(x)\int^x \frac{\exp\left[-\int^{x_2} P(x_1)\,dx_1\right]}{[y_1(x_2)]^2}\,dx_2. \tag{8.83} $$

Note that the lower limits $x_1 = a$ and $x_2 = b$ have been omitted. If they are 
retained, they simply make a contribution equal to a constant times the known 
first solution, hence add nothing new. 

If we have the important special case of $P(x) = 0$, Eq. 8.83 reduces to 

$$ y_2(x) = y_1(x)\int^x \frac{dx_2}{[y_1(x_2)]^2}. \tag{8.84} $$


This means that by using either Eq. 8.83 or 8.84 we can take one known solution 
and by integrating can generate a second independent solution of Eq. 8.73. This 
technique is used in Section 12.10 to generate a second solution of Legendre’s 
differential equation. 


EXAMPLE 8.6.3 A Second Solution for the Linear Oscillator Equation 


From $d^2y/dx^2 + y = 0$ with $P(x) = 0$ let one solution be $y_1 = \sin x$. By 
applying Eq. 8.84, we obtain 

$$ y_2(x) = \sin x\int^x \frac{dx_2}{\sin^2 x_2} = \sin x\,(-\cot x) = -\cos x, $$

which is clearly independent (not a linear multiple) of $\sin x$. 
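
When the integral in Eq. 8.84 cannot be done in closed form, it can still be evaluated numerically. The following sketch applies a composite Simpson rule to the example above. The lower limit $b = \pi/2$ is an assumption made here so that $\cot b = 0$ and no multiple of $y_1$ is added; the names `simpson` and `second_solution` are hypothetical.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def second_solution(y1, x, b):
    """Eq. 8.84 with an explicit lower limit b: y2(x) = y1(x) * int_b^x ds / y1(s)**2."""
    return y1(x) * simpson(lambda s: 1.0 / y1(s) ** 2, b, x)

b = math.pi / 2  # lower limit chosen so that cot(b) = 0 (illustrative choice)
for x in (0.5, 1.0, 2.0):
    # The two printed values should agree: numerical y2(x) and -cos(x).
    print(x, second_solution(math.sin, x, b), -math.cos(x))
```

A different choice of $b$ would add a constant multiple of $\sin x$ to the result, in line with the remark following Eq. 8.83. The integration interval must not contain a zero of $y_1$.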


Series Form of the Second Solution 
Further insight into the nature of the second solution of our differential 
equation may be obtained by the following sequence of operations: 

1. Express $P(x)$ and $Q(x)$ in Eq. 8.73 as 

$$ P(x) = \sum_{i=-1}^{\infty} p_i x^i, \qquad Q(x) = \sum_{j=-2}^{\infty} q_j x^j. \tag{8.85} $$

The lower limits of the summations are selected to 
create the strongest possible regular singularity (at 
the origin). These conditions just satisfy Fuchs’s 
theorem and thus help us gain a better understanding 
of Fuchs’s theorem. 

2. Develop the first few terms of a power-series solution, 
as in Section 8.5. 

3. Using this solution as $y_1$, obtain a second series type 
solution, $y_2$, with Eq. 8.83, integrating term by term. 

Proceeding with step 1, we have 

$$ y'' + (p_{-1}x^{-1} + p_0 + p_1 x + \cdots)y' + (q_{-2}x^{-2} + q_{-1}x^{-1} + \cdots)y = 0, \tag{8.86} $$

in which point $x = 0$ is at worst a regular singular point. If $p_{-1} = q_{-1} = q_{-2} = 0$, 
it reduces to an ordinary point. Substituting 

$$ y = \sum_{\lambda=0}^{\infty} a_\lambda x^{k+\lambda} $$

(step 2), we obtain 

$$ \sum_{\lambda=0}^{\infty}(k+\lambda)(k+\lambda-1)a_\lambda x^{k+\lambda-2} + \sum_{i=-1}^{\infty} p_i x^i \sum_{\lambda=0}^{\infty}(k+\lambda)a_\lambda x^{k+\lambda-1} + \sum_{j=-2}^{\infty} q_j x^j \sum_{\lambda=0}^{\infty} a_\lambda x^{k+\lambda} = 0. \tag{8.87} $$

Assuming that $p_{-1} \neq 0$, $q_{-2} \neq 0$, our indicial equation is 

$$ k(k - 1) + p_{-1}k + q_{-2} = 0, $$

which sets the net coefficient of $x^{k-2}$ equal to zero. This reduces to 

$$ k^2 + (p_{-1} - 1)k + q_{-2} = 0. \tag{8.88} $$

We denote the two roots of this indicial equation by $k = \alpha$ and $k = \alpha - n$, where 
$n$ is zero or a positive integer. (If $n$ is not an integer, we expect two independent 
series solutions by the methods of Section 8.5 and there is no problem.) Then 

$$ (k - \alpha)(k - \alpha + n) = 0, \tag{8.89} $$

or 

$$ k^2 + (n - 2\alpha)k + \alpha(\alpha - n) = 0, $$

and equating coefficients of $k$ in Eqs. 8.88 and 8.89, we have 

$$ p_{-1} - 1 = n - 2\alpha. \tag{8.90} $$

The known series solution corresponding to the larger root $k = \alpha$ may be 
written as 

$$ y_1 = x^\alpha \sum_{\lambda=0}^{\infty} a_\lambda x^\lambda. $$

Substituting this series solution into Eq. 8.83 (step 3), we are faced with 

$$ y_2(x) = y_1(x)\int^x \frac{\exp\left(-\int_a^{x_2}\sum_{i=-1}^{\infty} p_i x_1^i\,dx_1\right)}{x_2^{2\alpha}\left(\sum_{\lambda=0}^{\infty} a_\lambda x_2^\lambda\right)^2}\,dx_2, \tag{8.91} $$

where the solutions $y_1$ and $y_2$ have been normalized so that the Wronskian, 
$W(a) = 1$. Tackling the exponential factor first, we have 

$$ \int_a^{x_2}\sum_{i=-1}^{\infty} p_i x_1^i\,dx_1 = p_{-1}\ln x_2 + \sum_{k=0}^{\infty}\frac{p_k}{k+1}x_2^{k+1} + f(a). \tag{8.92} $$

Hence 

$$ \begin{aligned} \exp\left(-\int_a^{x_2}\sum_i p_i x_1^i\,dx_1\right) &= \exp[-f(a)]\,x_2^{-p_{-1}}\exp\left(-\sum_{k=0}^{\infty}\frac{p_k}{k+1}x_2^{k+1}\right) \\ &= \exp[-f(a)]\,x_2^{-p_{-1}}\left[1 - \sum_{k=0}^{\infty}\frac{p_k}{k+1}x_2^{k+1} + \frac{1}{2!}\left(\sum_{k=0}^{\infty}\frac{p_k}{k+1}x_2^{k+1}\right)^2 + \cdots\right]. \end{aligned} \tag{8.93} $$

This final series expansion of the exponential is certainly convergent if the 
original expansion of the coefficient $P(x)$ was convergent. 

The denominator in Eq. 8.91 may be handled by writing 

$$ \left[x_2^{2\alpha}\left(\sum_{\lambda=0}^{\infty} a_\lambda x_2^\lambda\right)^2\right]^{-1} = x_2^{-2\alpha}\left(\sum_{\lambda=0}^{\infty} a_\lambda x_2^\lambda\right)^{-2} = x_2^{-2\alpha}\sum_{\lambda=0}^{\infty} b_\lambda x_2^\lambda. \tag{8.94} $$

Neglecting constant factors that will be picked up anyway by the requirement 
that $W(a) = 1$, we obtain 

$$ y_2(x) = y_1(x)\int^x x_2^{-p_{-1}-2\alpha}\left(\sum_{\lambda=0}^{\infty} c_\lambda x_2^\lambda\right)dx_2. \tag{8.95} $$

By Eq. 8.90 

$$ x_2^{-p_{-1}-2\alpha} = x_2^{-n-1}, \tag{8.96} $$

and we have assumed here that $n$ is an integer. Substituting this result into 
Eq. 8.95, we obtain 

$$ y_2(x) = y_1(x)\int^x \left(c_0 x_2^{-n-1} + c_1 x_2^{-n} + c_2 x_2^{-n+1} + \cdots + c_n x_2^{-1} + \cdots\right)dx_2. \tag{8.97} $$

The integration indicated in Eq. 8.97 leads to a coefficient of $y_1(x)$ consisting of 
two parts: 

1. A power series starting with $x^{-n}$. 

2. A logarithm term from the integration of $x^{-1}$ (when 
$\lambda = n$). This term always appears when $n$ is an integer 
unless $c_n$ fortuitously happens to vanish.⁴ 

⁴ For parity considerations, $\ln x$ is taken to be $\ln|x|$, even. 

EXAMPLE 8.6.4 A Second Solution of Bessel’s Equation 


From Bessel’s equation, Eq. 8.56 (divided by $x^2$ to agree with Eq. 8.73), we 
have 

$$ P(x) = x^{-1}, \qquad Q(x) = 1 \qquad \text{for the case } n = 0. $$

Hence $p_{-1} = 1$, $q_0 = 1$; all other $p_i$'s and $q_j$'s vanish. The Bessel indicial equation 
is 

$$ k^2 = 0 $$

(Eq. 8.59 with $n = 0$). Hence we verify Eqs. 8.88 to 8.90 with $n = 0$ and $\alpha = 0$. 

Our first solution is available from Eq. 8.64. Relabeling it to agree with 
Chapter 11 (and using $a_0 = 1$), we obtain⁵ 

$$ y_1(x) = J_0(x) = 1 - \frac{x^2}{4} + \frac{x^4}{64} + O(x^6). \tag{8.98a} $$

⁵ The capital $O$ (order of) as written here means terms proportional to $x^6$ 
and possibly higher powers of $x$. 

Now, substituting all this into Eq. 8.83, we have the specific case corresponding 
to Eq. 8.91: 

$$ y_2(x) = J_0(x)\int^x \frac{\exp\left[-\int^{x_2} x_1^{-1}\,dx_1\right]}{\left[1 - x_2^2/4 + x_2^4/64 - \cdots\right]^2}\,dx_2. \tag{8.98b} $$

From the numerator of the integrand 

$$ \exp\left[-\int^{x_2}\frac{dx_1}{x_1}\right] = \exp[-\ln x_2] = \frac{1}{x_2}. $$

This corresponds to the $x_2^{-p_{-1}}$ in Eq. 8.93. From the denominator of the integrand, 
using a binomial expansion, we obtain 

$$ \left[1 - \frac{x_2^2}{4} + \frac{x_2^4}{64}\right]^{-2} = 1 + \frac{x_2^2}{2} + \frac{5x_2^4}{32} + \cdots. $$

Corresponding to Eq. 8.95, we have 

$$ \begin{aligned} y_2(x) &= J_0(x)\int^x \frac{1}{x_2}\left[1 + \frac{x_2^2}{2} + \frac{5x_2^4}{32} + \cdots\right]dx_2 \\ &= J_0(x)\left\{\ln x + \frac{x^2}{4} + \frac{5x^4}{128} + \cdots\right\}. \end{aligned} \tag{8.98c} $$

Let us check this result. From Eq. 11.63, which gives the standard form of the 
second solution, 

$$ N_0(x) = \frac{2}{\pi}[\ln x - \ln 2 + \gamma]J_0(x) + \frac{2}{\pi}\left\{\frac{x^2}{4} - \frac{3x^4}{128} + \cdots\right\}. \tag{8.98d} $$

Two points arise: (1) Since Bessel’s equation is homogeneous, we may multiply 
$y_2(x)$ by any constant. To match $N_0(x)$, we multiply our $y_2(x)$ by $2/\pi$. (2) To our 
second solution $(2/\pi)y_2(x)$, we may add any constant multiple of the first solution. 
Again, to match $N_0(x)$ we add 

$$ \frac{2}{\pi}[-\ln 2 + \gamma]J_0(x), $$

where $\gamma$ is the usual Euler-Mascheroni constant (Section 5.2).⁶ Our new, 
modified second solution is 

$$ y_2(x) = \frac{2}{\pi}[\ln x - \ln 2 + \gamma]J_0(x) + \frac{2}{\pi}J_0(x)\left\{\frac{x^2}{4} + \frac{5x^4}{128} + \cdots\right\}. \tag{8.98e} $$

Now the comparison with $N_0(x)$ becomes a simple multiplication of $J_0(x)$ from 
Eq. 8.98a and the curly bracket of Eq. 8.98c. The multiplication checks, through 
terms of order $x^2$ and $x^4$, which is all we carried. Our second solution from Eqs. 
8.83 and 8.91 agrees with the standard second solution, the Neumann function, 
$N_0(x)$. 
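
The series manipulations of this example can be repeated in exact rational arithmetic. The sketch below is illustrative only: it works with truncated power series in $u = x^2$, and the helper names `mul` and `inverse` are hypothetical. It inverts $J_0^2$, integrates term by term, and multiplies by $J_0$, keeping terms through $x^4$.

```python
from fractions import Fraction as F

ORDER = 3  # keep coefficients of u^0, u^1, u^2, where u = x^2

def mul(p, q):
    """Product of two series in u, truncated to ORDER terms."""
    out = [F(0)] * ORDER
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if i + j < ORDER:
                out[i + j] += a * b
    return out

def inverse(p):
    """Reciprocal series 1/p (requires p[0] != 0), truncated to ORDER terms."""
    inv = [F(1) / p[0]]
    for k in range(1, ORDER):
        s = sum(p[i] * inv[k - i] for i in range(1, k + 1))
        inv.append(-s / p[0])
    return inv

j0 = [F(1), F(-1, 4), F(1, 64)]   # J_0 = 1 - x^2/4 + x^4/64 - ...
inv_sq = inverse(mul(j0, j0))     # series for 1 / J_0^2

# Integrate (1/x) * inv_sq term by term: the u^0 term gives ln x,
# and u^k / x = x^(2k-1) integrates to u^k / (2k).
bracket = [F(0)] + [inv_sq[k] / (2 * k) for k in range(1, ORDER)]

# Non-logarithmic part of y_2: J_0 times the power series in the bracket.
power_part = mul(j0, bracket)

print(inv_sq)      # coefficients of 1, x^2, x^4 in 1/J_0^2
print(bracket)     # coefficients in the curly bracket of Eq. 8.98c (without ln x)
print(power_part)  # coefficients to compare with the curly bracket of Eq. 8.98d
```

The three lists reproduce the coefficients $1, \tfrac12, \tfrac{5}{32}$ of the binomial expansion, $\tfrac14, \tfrac{5}{128}$ of Eq. 8.98c, and $\tfrac14, -\tfrac{3}{128}$ of Eq. 8.98d.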

From the preceding analysis, the second solution of Eq. 8.73, $y_2(x)$, may be 
written as 

$$ y_2(x) = y_1(x)\ln x + \sum_{j=-n}^{\infty} d_j x^{j+\alpha}, \tag{8.98f} $$

the first solution times $\ln x$ and another power series, this one starting with 
$x^{\alpha-n}$, which means that we may look for a logarithmic term when the indicial 
equation of Section 8.5 gives only one series solution. With the form of the 
second solution specified by Eq. 8.98f we can substitute Eq. 8.98f into the 
original differential equation and determine the coefficients $d_j$ exactly as in 
Section 8.5. It may be worth noting that no series expansion of $\ln x$ is needed. In 
the substitution $\ln x$ will drop out; its derivatives will survive. 

The second solution will usually diverge at the origin because of the logarithmic 
factor and the negative powers of $x$ in the series. For this reason $y_2(x)$ is often 
referred to as the irregular solution. The first series solution, $y_1(x)$, which usually 
converges at the origin, is called the regular solution. The question of behavior 
at the origin is discussed in more detail in Chapters 11 and 12 in which we take 
up Bessel functions, modified Bessel functions, and Legendre functions. 

Summary 

These two sections (together with the exercises) provide a complete solution 
of our linear, homogeneous, second-order differential equation, assuming 
that the point of expansion is no worse than a regular singularity. At least one 
solution can always be obtained by series substitution (Section 8.5). A second, 
linearly independent solution can be constructed by the Wronskian double 
integral, Eq. 8.83. These are all there are: no third, linearly independent solution 
exists (compare Exercise 8.6.10). 

⁶ The Neumann function $N_0$ is defined as it is in order to achieve convenient 
asymptotic properties, Section 11.6. 

The nonhomogeneous, linear, second-order differential equation will have an 
additional solution: the particular solution. This particular solution may be 
obtained by the method of variation of parameters, Exercise 8.6.25, or by techniques 
such as Green’s functions, Section 8.7. 


EXERCISES 


8.6.1 You know that the three unit vectors i, j, and k are mutually perpendicular 
(orthogonal). Show that i, j, and k are linearly independent. Specifically, show 
that no relation of the form of Eq. 8.67 exists for i, j, and k. 


8.6.2 The criterion for the linear independence of three vectors A, B, and C is that the 
equation 

$$ a\mathbf{A} + b\mathbf{B} + c\mathbf{C} = 0 $$

(analogous to Eq. 8.67) has no solution other than the trivial $a = b = c = 0$. Using 
components $\mathbf{A} = (A_1, A_2, A_3)$, and so on, set up the determinant criterion for the 
existence or nonexistence of a nontrivial solution for the coefficients $a$, $b$, and $c$. 
Show that your criterion is equivalent to the scalar product $\mathbf{A}\cdot\mathbf{B}\times\mathbf{C} \neq 0$. 

8.6.3 Using the Wronski determinant, show that the set of functions 

$$ \left\{1,\ \frac{x^n}{n!}\ (n = 1, 2, \ldots, N)\right\} $$

is linearly independent. 

8.6.4 If the Wronskian of two functions $y_1$ and $y_2$ is identically zero, show by direct 
integration that 

$$ y_1 = cy_2; $$

that is, $y_1$ and $y_2$ are dependent. Assume the functions have continuous derivatives 
and that at least one of the functions does not vanish in the interval under 
consideration. 


8.6.5 The Wronskian of two functions is found to be zero at x = x 0 . Show that this 
Wronskian vanishes for all x and that the functions are linearly dependent. 

8.6.6 The three functions $\sin x$, $e^x$, and $e^{-x}$ are linearly independent. No one function 
can be written as a linear combination of the other two. Show that the Wronskian 
of $\sin x$, $e^x$, and $e^{-x}$ vanishes but only at isolated points. 

ANS. $W = 4\sin x$, 
$W = 0$ for $x = \pm n\pi$, $n = 0, 1, 2, \ldots$. 

8.6.7 Consider two functions $\varphi_1 = x$ and $\varphi_2 = |x| = x\,\mathrm{sgn}\,x$ (Fig. 8.4). The function 
$\mathrm{sgn}\,x$ is just the sign of $x$. Since $\varphi_1' = 1$ and $\varphi_2' = \mathrm{sgn}\,x$, $W(\varphi_1, \varphi_2) = 0$ for any 
interval including $[-1, +1]$. Does the vanishing of the Wronskian over $[-1, +1]$ 
prove that $\varphi_1$ and $\varphi_2$ are linearly dependent? Clearly, they are not. What is 
wrong? 

8.6.8 Explain that linear independence does not mean the absence of any dependence. 
Illustrate your argument with $\cosh x$ and $e^x$. 

8.6.9 Legendre’s differential equation 

$$ (1 - x^2)y'' - 2xy' + n(n+1)y = 0 $$

has a regular solution $P_n(x)$ and an irregular solution $Q_n(x)$. Show that the 
Wronskian of $P_n$ and $Q_n$ is given by 

$$ P_n(x)Q_n'(x) - P_n'(x)Q_n(x) = \frac{A_n}{1 - x^2}, $$

with $A_n$ independent of $x$. 

8.6.10 Show, by means of the Wronskian, that a linear, second-order, homogeneous, 
differential equation of the form 

$$ y''(x) + P(x)y'(x) + Q(x)y(x) = 0 $$

cannot have three independent solutions. (Assume a third solution and show 
that the Wronskian vanishes for all $x$.) 

8.6.11 Transform our linear, second-order, differential equation 

$$ y'' + P(x)y' + Q(x)y = 0 $$

by the substitution 

$$ y = z\exp\left[-\frac{1}{2}\int^x P(t)\,dt\right] $$

and show that the resulting differential equation for $z$ is 

$$ z'' + q(x)z = 0, $$

where 

$$ q(x) = Q(x) - \tfrac{1}{2}P'(x) - \tfrac{1}{4}P^2(x). $$

Note. This substitution can be derived by the technique of Exercise 8.6.24. 

8.6.12 Use the result of Exercise 8.6.11 to show that the replacement of $\varphi(r)$ by $r\varphi(r)$ 
may be expected to eliminate the first derivative from the Laplacian in spherical 
polar coordinates. See also Exercise 2.5.18(b). 

8.6.13 By direct differentiation and substitution show that 

$$ y_2(x) = y_1(x)\int^x \frac{\exp\left[-\int^s P(t)\,dt\right]}{[y_1(s)]^2}\,ds $$

satisfies 

$$ y_2''(x) + P(x)y_2'(x) + Q(x)y_2(x) = 0. $$

Note. The Leibnitz formula for the derivative of an integral is 

$$ \frac{d}{d\alpha}\int_{g(\alpha)}^{h(\alpha)} f(x, \alpha)\,dx = \int_{g(\alpha)}^{h(\alpha)}\frac{\partial f(x, \alpha)}{\partial\alpha}\,dx + f[h(\alpha), \alpha]\frac{dh(\alpha)}{d\alpha} - f[g(\alpha), \alpha]\frac{dg(\alpha)}{d\alpha}. $$

8.6.14 In the equation 

$$ y_2(x) = y_1(x)\int^x \frac{\exp\left[-\int^s P(t)\,dt\right]}{[y_1(s)]^2}\,ds $$

$y_1(x)$ satisfies 

$$ y_1'' + P(x)y_1' + Q(x)y_1 = 0. $$

The function $y_2(x)$ is a linearly independent second solution of the same equation. 
Show that the inclusion of lower limits on the two integrals leads to nothing new; 
that is, it throws in only overall factors and/or a multiple of the known solution 
$y_1(x)$. 

8.6.15 Given that one solution of 

$$ R'' + \frac{1}{r}R' - \frac{m^2}{r^2}R = 0 $$

is $R = r^m$, show that Eq. 8.83 predicts a second solution, $R = r^{-m}$. 

8.6.16 Using $y_1(x) = \sum_{n=0}^{\infty}(-1)^n x^{2n+1}/(2n+1)!$ as a solution of the linear oscillator 
equation, follow the analysis culminating in Eq. 8.98f and show that $c_1 = 0$ so 
that the second solution does not, in this case, contain a logarithmic term. 

8.6.17 Show that when $n$ is not an integer the second solution of Bessel’s equation, 
obtained from Eq. 8.83, does not contain a logarithmic term. 

8.6.18 (a) One solution of Hermite’s differential equation 

$$ y'' - 2xy' + 2\alpha y = 0 $$

for $\alpha = 0$ is $y_1(x) = 1$. Find a second solution $y_2(x)$, using Eq. 8.83. Show 
that your second solution is equivalent to $y_{\rm odd}$ (Exercise 8.5.6). 

(b) Find a second solution for $\alpha = 1$, where $y_1(x) = x$, using Eq. 8.83. Show that 
your second solution is equivalent to $y_{\rm even}$ (Exercise 8.5.6). 

8.6.19 One solution of Laguerre’s differential equation 

$$ xy'' + (1 - x)y' + ny = 0 $$

for $n = 0$ is $y_1(x) = 1$. Using Eq. 8.83, develop a second, linearly independent 
solution. Exhibit the logarithmic term explicitly. 


8.6.20 For Laguerre’s equation with $n = 0$ 

$$ y_2(x) = \int^x \frac{e^s}{s}\,ds. $$

(a) Write $y_2(x)$ as a logarithm plus a power series. 

(b) Verify that the integral form of $y_2(x)$, previously given, is a solution of 
Laguerre’s equation ($n = 0$) by direct differentiation of the integral and 
substitution into the differential equation. 

(c) Verify that the series form of $y_2(x)$, part (a), is a solution by differentiating 
the series and substituting back into Laguerre’s equation. 

8.6.21 One solution of the Chebyshev equation 

$$ (1 - x^2)y'' - xy' + n^2 y = 0 $$

for $n = 0$ is $y_1 = 1$. 

(a) Using Eq. 8.83, develop a second, linearly independent solution. 

(b) Find a second solution by direct integration of the Chebyshev equation. 

Hint. Let $v = y'$ and integrate. Compare your result with the second solution given 
in Section 13.3. 

ANS. (a) $y_2 = \sin^{-1} x$. 

(b) The second solution, $V_n(x)$, is not defined for $n = 0$. 

8.6.22 One solution of the Chebyshev equation 

$$ (1 - x^2)y'' - xy' + n^2 y = 0 $$

for $n = 1$ is $y_1(x) = x$. Set up the Wronskian double integral solution and derive 
a second solution, $y_2(x)$. 

ANS. $y_2 = -(1 - x^2)^{1/2}$. 

8 . 6.23 The radial Schrodinger wave equation has the form 

+ = £ y(r). 

2m dr 2 mr J 

The potential energy V(r) may be expanded about the origin as 

V(r) = Li + b 0 + b t r + ■ ■ ■ . 
r 

(a) Show that there is one (regular) solution starting with r m . 

(b) From Eq. 8.84 show that the irregular solution diverges at the origin as r~ l . 


8.6.24 Show that if a second solution, $y_2$, is assumed to have the form $y_2(x) = y_1(x) f(x)$, 
substitution back into the original equation 

$$y_2'' + P(x) y_2' + Q(x) y_2 = 0$$

leads to 

$$f(x) = \int^x \frac{\exp\left[ -\int^s P(t)\, dt \right]}{[y_1(s)]^2}\, ds,$$

in agreement with Eq. 8.83. 


8.6.25 If our linear, second-order differential equation is nonhomogeneous, that is, of 
the form of Eq. 8.38, the most general solution is 

$$y(x) = y_1(x) + y_2(x) + y_p(x).$$

($y_1$ and $y_2$ are solutions of the homogeneous equation.) 

Show that 

$$y_p(x) = y_2(x) \int^x \frac{y_1(s) F(s)\, ds}{W\{y_1(s), y_2(s)\}} - y_1(x) \int^x \frac{y_2(s) F(s)\, ds}{W\{y_1(s), y_2(s)\}},$$

with $W\{y_1(s), y_2(s)\}$ the Wronskian of $y_1(s)$ and $y_2(s)$. 

Hint. As in Exercise 8.6.24, let $y_p(x) = y_1(x) v(x)$ and develop a first-order differential 
equation for $v'(x)$. 

8.6.26 (a) Show that 

$$y'' + \frac{1 - \alpha^2}{4x^2}\, y = 0$$

has two solutions: 

$$y_1(x) = a_0 x^{(1+\alpha)/2}, \qquad y_2(x) = a_0 x^{(1-\alpha)/2}.$$

(b) For $\alpha = 0$ the two linearly independent solutions of part (a) reduce to 
$y_{10} = a_0 x^{1/2}$. Using Eq. 8.84 derive a second solution 

$$y_{20}(x) = a_0 x^{1/2} \ln x.$$

Verify that $y_{20}$ is indeed a solution. 

(c) Show that the second solution from part (b) may be obtained as a limiting 
case from the two solutions of part (a): 

$$y_{20}(x) = \lim_{\alpha \to 0} \left( \frac{y_1 - y_2}{\alpha} \right).$$


8.7 NONHOMOGENEOUS EQUATION — GREEN'S 
FUNCTION 

The series substitution of Section 8.5 and the Wronskian double integral of 
Section 8.6 provide the most general solution of the homogeneous, linear, second-order 
differential equation. The specific solution, $y_p$, linearly dependent on the 
source term ($F(x)$ of Eq. 8.78) may be cranked out by the variation of parameters 
method, Exercise 8.6.25. In this section we turn to a different method of solution: 
Green’s functions. 

For a brief introduction to the Green’s function method, as applied to the solution 
of a nonhomogeneous partial differential equation, it is helpful to use the 
electrostatic analog. In the presence of charges the electrostatic potential $\psi$ 
satisfies Poisson’s nonhomogeneous equation (compare Section 1.14) 

$$\nabla^2 \psi = -\frac{\rho}{\varepsilon_0} \quad \text{(mks units)} \tag{8.99}$$

and Laplace’s homogeneous equation, 

$$\nabla^2 \psi = 0, \tag{8.100}$$

in the absence of electric charge ($\rho = 0$). If the charges are point charges $q_i$, we 
know that the solution is 

$$\psi = \frac{1}{4\pi\varepsilon_0} \sum_i \frac{q_i}{r_i}, \tag{8.101}$$

a superposition of single-point charge solutions obtained from Coulomb’s law 
for the force between two point charges $q_1$ and $q_2$, 

$$\mathbf{F} = \frac{q_1 q_2\, \hat{\mathbf{r}}}{4\pi\varepsilon_0 r^2}. \tag{8.102}$$

By replacement of the discrete point charges with a smeared out distributed 
charge, charge density $\rho$, Eq. 8.101 becomes 

$$\psi(r = 0) = \frac{1}{4\pi\varepsilon_0} \int \frac{\rho(\mathbf{r})}{r}\, d\tau \tag{8.103}$$

or, for the potential at $\mathbf{r} = \mathbf{r}_1$ away from the origin and the charge at $\mathbf{r} = \mathbf{r}_2$, 

$$\psi(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0} \int \frac{\rho(\mathbf{r}_2)}{|\mathbf{r}_1 - \mathbf{r}_2|}\, d\tau_2. \tag{8.104}$$


Dirac Delta Function 

A formal derivation and generalization of this result is facilitated by using 
$\delta(x)$, the Dirac delta function, as in Section 1.15. For the one-dimensional case 
the Dirac delta function is often defined by the following properties: 

$$\delta(x) = 0, \qquad x \neq 0, \tag{8.105}$$

$$\int_{-\infty}^{\infty} \delta(x)\, dx = 1, \tag{8.106}$$

and 

$$\int_{-\infty}^{\infty} f(x)\, \delta(x)\, dx = f(0). \tag{8.107}$$

Here it is assumed that $f(x)$ is continuous at $x = 0$. 

From these defining equations $\delta(x)$ must be an infinitely high, infinitely thin 
spike, as in the description of an impulsive force (Section 15.9) or charge 
density for a point charge.$^1$ The problem is that no such function exists in the 
usual sense of function. It is possible to approximate the delta function by a 
variety of functions, Eqs. 8.108 to 8.111 and Figs. 8.5 to 8.8: 

$$\delta_n(x) = \begin{cases} 0, & x < -\dfrac{1}{2n}, \\ n, & -\dfrac{1}{2n} < x < \dfrac{1}{2n}, \\ 0, & x > \dfrac{1}{2n}, \end{cases} \tag{8.108}$$

$$\delta_n(x) = \frac{n}{\sqrt{\pi}} \exp(-n^2 x^2), \tag{8.109}$$

$$\delta_n(x) = \frac{n}{\pi} \cdot \frac{1}{1 + n^2 x^2}, \tag{8.110}$$

$$\delta_n(x) = \frac{\sin nx}{\pi x} = \frac{1}{2\pi} \int_{-n}^{n} e^{ixt}\, dt. \tag{8.111}$$

$^1$ The delta function is frequently invoked to describe very short range forces 
such as nuclear forces. It also appears in the normalization of continuum wave 
functions of quantum mechanics. Compare Eq. 15.21d for plane wave 
eigenfunctions. 

FIG. 8.5 $\delta$-sequence function 



These approximations have varying degrees of usefulness. Equation 8.108 is 
useful in providing a simple derivation of the integral property, Eq. 8.107. 
Equation 8.109 is convenient to differentiate. Its derivatives lead to the Hermite 
polynomials, Eq. 13.7. Equation 8.111 is particularly useful in Fourier analysis 
and in its applications to quantum mechanics. In the theory of Fourier series, 
Eq. 8.111 often appears (modified) as the Dirichlet kernel: 

$$\delta_n(x) = \frac{1}{2\pi} \frac{\sin[(n + \frac{1}{2})x]}{\sin(\frac{1}{2}x)}. \tag{8.112}$$

In using these approximations in Eq. 8.107 and later, we assume that $f(x)$ is well 
behaved; it offers no problems at large $x$. 

For most physical purposes such approximations are quite adequate. From a 
mathematical point of view the situation is still unsatisfactory: The limits 

$$\lim_{n \to \infty} \delta_n(x)$$

do not exist. 

FIG. 8.8 $\delta$-sequence function 


A way out of this difficulty is provided by the theory of distributions. Recognizing 
that Eq. 8.107 is the fundamental property, we focus our attention on it 
rather than on $\delta(x)$ itself. Equations 8.108 to 8.111 with $n = 1, 2, 3, \ldots$ may be 
interpreted as sequences of normalized functions: 

$$\int_{-\infty}^{\infty} \delta_n(x)\, dx = 1. \tag{8.113}$$

The sequence of integrals has the limit 

$$\lim_{n \to \infty} \int_{-\infty}^{\infty} \delta_n(x) f(x)\, dx = f(0). \tag{8.114}$$

Note carefully that Eq. 8.114 is the limit of a sequence of integrals. Again, the 
limit of $\delta_n(x)$, $n \to \infty$, does not exist. (The limits for all four forms of $\delta_n(x)$ diverge 
at $x = 0$.) 
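
The distinction between the limit of the integrals (which exists) and the limit of $\delta_n(x)$ itself (which does not) can be checked numerically. The following Python sketch is an illustration only and is not part of the original text: the function names, the test function $\cos x$, the finite integration window, and the trapezoidal rule are all assumptions made for convenience. It evaluates the integral in Eq. 8.114 for the sequences of Eqs. 8.109 and 8.110.

```python
import math


def delta_gauss(n, x):
    """Gaussian delta sequence, Eq. 8.109."""
    return n / math.sqrt(math.pi) * math.exp(-(n * x) ** 2)


def delta_lorentz(n, x):
    """Lorentzian delta sequence, Eq. 8.110."""
    return (n / math.pi) / (1.0 + (n * x) ** 2)


def smeared_value(delta_n, f, n, half_width=20.0, points=200_001):
    """Trapezoidal estimate of the integral of delta_n(x) f(x) over [-L, L].

    The finite window L = half_width stands in for the infinite range
    (an assumption; it is adequate only if the tails are negligible).
    """
    h = 2.0 * half_width / (points - 1)
    total = 0.0
    for i in range(points):
        x = -half_width + i * h
        weight = 0.5 if i in (0, points - 1) else 1.0
        total += weight * delta_n(n, x) * f(x)
    return total * h


if __name__ == "__main__":
    for n in (1, 10, 100):
        gauss = smeared_value(delta_gauss, math.cos, n)
        lorentz = smeared_value(delta_lorentz, math.cos, n)
        print(f"n = {n:3d}   Gaussian: {gauss:.6f}   Lorentzian: {lorentz:.6f}")
```

For $f(x) = \cos x$ the integrals over the infinite range are $\exp(-1/4n^2)$ and $\exp(-1/n)$, respectively, so both columns approach $f(0) = 1$ as $n$ grows, even though $\delta_n(0)$ increases without bound.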

We may treat $\delta(x)$ consistently in the form 

$$\int_{-\infty}^{\infty} \delta(x) f(x)\, dx = \lim_{n \to \infty} \int_{-\infty}^{\infty} \delta_n(x) f(x)\, dx. \tag{8.115}$$

$\delta(x)$ is labeled a distribution (not a function) defined by the sequences $\delta_n(x)$ as 
indicated in Eq. 8.115. We might emphasize that the integral on the left-hand side 
of Eq. 8.115 is not a Riemann integral.$^2$ It is a limit. 

This distribution $\delta(x)$ is only one of an infinity of possible distributions, but 
it is the one we are interested in because of Eq. 8.107. 

We use $\delta(x)$ frequently and call it the Dirac delta function$^3$ for historical 
reasons. Remember that it is not really a function. It is essentially a shorthand 
notation, defined implicitly as the limit of integrals of a sequence, $\delta_n(x)$, according 
to Eq. 8.115. It should be understood that our Dirac delta function has 
significance only as part of an integrand and never as an end result. In this spirit 
the Dirac delta function is often regarded as an operator, a linear operator: 
$\delta(x - x_0)$ operates on $f(x)$ and yields $f(x_0)$: 

$$\mathcal{L}(x_0) f(x) \equiv \int_{-\infty}^{\infty} \delta(x - x_0) f(x)\, dx = f(x_0). \tag{8.116}$$


It may also be classified as a linear mapping or simply as a generalized function. 
Shifting our singularity to the point $x = x'$, we write the Dirac delta function as 
$\delta(x - x')$. Equation 8.107 becomes 

$$\int_{-\infty}^{\infty} f(x)\, \delta(x - x')\, dx = f(x'). \tag{8.117}$$

As a description of a singularity at $x = x'$, the Dirac delta function may be 
written as $\delta(x - x')$ or as $\delta(x' - x)$. Going to three dimensions and using 
spherical polar coordinates, we obtain 

$$\int_0^{2\pi}\!\! \int_0^{\pi}\!\! \int_0^{\infty} \delta(\mathbf{r})\, r^2\, dr \sin\theta\, d\theta\, d\varphi = \int\!\!\int\!\!\int_{-\infty}^{\infty} \delta(x)\, \delta(y)\, \delta(z)\, dx\, dy\, dz = 1. \tag{8.118}$$

This corresponds to a singularity (or source) at the origin. Again, if our source is 
at $\mathbf{r} = \mathbf{r}_1$, Eq. 8.118 becomes 

$$\int\!\!\int\!\!\int \delta(\mathbf{r}_2 - \mathbf{r}_1)\, r_2^2\, dr_2 \sin\theta_2\, d\theta_2\, d\varphi_2 = 1. \tag{8.119}$$

$^2$ It can be treated as a Stieltjes integral if desired. $\delta(x)\, dx$ is replaced by $du(x)$, 
where $u(x)$ is the Heaviside step function (compare Exercise 8.7.13). 

$^3$ Dirac introduced the delta function to quantum mechanics. Actually the 
delta function can be traced back to Kirchhoff, 1882. For further details see 
M. Jammer, The Conceptual Development of Quantum Mechanics. McGraw-Hill, 
New York (1966). 

As already mentioned, 

$$\delta(\mathbf{r}_2 - \mathbf{r}_1) = \delta(\mathbf{r}_1 - \mathbf{r}_2). \tag{8.120}$$


Poisson's Equation: Green's Function Solution 



Returning to our electrostatic problem, we use $\psi$ as the potential corresponding 
to the given distribution of charge and therefore satisfying Poisson’s equation 

$$\nabla^2 \psi = -\frac{\rho}{\varepsilon_0}, \tag{8.121}$$

whereas a function $G$, which we label Green’s function, is required to satisfy 
Poisson’s equation with a point source at the point defined by $\mathbf{r}_2$: 

$$\nabla^2 G = -\delta(\mathbf{r}_1 - \mathbf{r}_2). \tag{8.122}$$

Physically, then, $G$ is the potential at $\mathbf{r}_1$ corresponding to a unit source ($\varepsilon_0$) at $\mathbf{r}_2$. 
By Green’s theorem (Section 1.11) 

$$\int (\psi \nabla^2 G - G \nabla^2 \psi)\, d\tau_2 = \int (\psi \nabla G - G \nabla \psi) \cdot d\boldsymbol{\sigma}. \tag{8.123}$$

Assuming that the integrand falls off faster than $r^{-2}$, we may simplify our 
problem by taking the volume so large that the surface integral vanishes, leaving 

$$\int \psi \nabla^2 G\, d\tau_2 = \int G \nabla^2 \psi\, d\tau_2 \tag{8.124}$$

or by substituting in Eqs. 8.121 and 8.122, we have 

$$-\int \psi(\mathbf{r}_2)\, \delta(\mathbf{r}_1 - \mathbf{r}_2)\, d\tau_2 = -\int \frac{G(\mathbf{r}_1, \mathbf{r}_2)\, \rho(\mathbf{r}_2)}{\varepsilon_0}\, d\tau_2. \tag{8.125}$$

Integration by employing the defining property of the Dirac delta function 
(Eq. 8.107) produces 

$$\psi(\mathbf{r}_1) = \frac{1}{\varepsilon_0} \int G(\mathbf{r}_1, \mathbf{r}_2)\, \rho(\mathbf{r}_2)\, d\tau_2. \tag{8.126}$$


Note that we have used Eq. 8.122 to eliminate $\nabla^2 G$ but that the function $G$ 
itself is still unknown. In Section 1.14, Gauss’s law, we found that 

$$\int \nabla^2 \left( \frac{1}{r} \right) d\tau = \begin{cases} 0, \\ -4\pi, \end{cases} \tag{8.127}$$

$0$ if the volume did not include the origin and $-4\pi$ if the origin were included. 
This result from Section 1.14 may be rewritten as 

$$\nabla^2 \left( \frac{1}{4\pi r} \right) = -\delta(\mathbf{r}), \qquad \text{or} \qquad \nabla^2 \left( \frac{1}{4\pi r_{12}} \right) = -\delta(\mathbf{r}_1 - \mathbf{r}_2), \tag{8.128}$$

corresponding to a shift of the electrostatic charge from the origin to the position 
$\mathbf{r} = \mathbf{r}_2$. Here $r_{12} = |\mathbf{r}_1 - \mathbf{r}_2|$, and the Dirac delta function $\delta(\mathbf{r}_1 - \mathbf{r}_2)$ vanishes 
unless $\mathbf{r}_1 = \mathbf{r}_2$. Therefore in a comparison of Eqs. 8.122 and 8.128 the function 
$G$ (Green’s function) is given by 

$$G(\mathbf{r}_1, \mathbf{r}_2) = \frac{1}{4\pi |\mathbf{r}_1 - \mathbf{r}_2|}. \tag{8.129}$$

The solution of our differential equation (Poisson’s equation) is 

$$\psi(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0} \int \frac{\rho(\mathbf{r}_2)}{|\mathbf{r}_1 - \mathbf{r}_2|}\, d\tau_2, \tag{8.130}$$

in complete agreement with Eq. 8.104. Actually $\psi(\mathbf{r}_1)$, Eq. 8.130, is the particular 
solution of Poisson’s equation. We may add solutions of Laplace’s equation 
(compare Eq. 8.39). Such solutions could describe an external field. 
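
The integral solution can be made concrete with a small numerical experiment. The Python sketch below is a hedged illustration, not the author's procedure: it replaces the integral of Eq. 8.126 by a sum over cubic cells filling a uniformly charged ball, takes $\varepsilon_0 = 1$ for convenience, and compares the result at an exterior point with the potential of a point charge equal to the total charge on the grid. The helper names `green_free_space`, `ball_cells`, and `potential` are hypothetical.

```python
import math


def green_free_space(r1, r2):
    """Free-space Green's function of Eq. 8.129: 1 / (4 pi |r1 - r2|)."""
    return 1.0 / (4.0 * math.pi * math.dist(r1, r2))


def ball_cells(radius, n):
    """Centers of the cubic cells (n per axis) lying inside a ball,
    together with the volume of one cell."""
    h = 2.0 * radius / n
    axis = [-radius + (i + 0.5) * h for i in range(n)]
    cells = [(x, y, z) for x in axis for y in axis for z in axis
             if x * x + y * y + z * z <= radius * radius]
    return cells, h ** 3


def potential(field_point, cells, rho, d_tau, eps0=1.0):
    """Riemann-sum version of Eq. 8.126: (1/eps0) * sum of G * rho * d_tau."""
    total = sum(green_free_space(field_point, cell) for cell in cells)
    return total * rho * d_tau / eps0


if __name__ == "__main__":
    cells, d_tau = ball_cells(radius=1.0, n=16)
    rho = 1.0                                   # uniform charge density (assumed)
    psi = potential((3.0, 0.0, 0.0), cells, rho, d_tau)
    q_total = rho * len(cells) * d_tau          # total charge carried by the grid
    print(psi, q_total / (4.0 * math.pi * 3.0))  # Green's sum vs. point charge
```

Outside the ball the two printed numbers should agree closely, which is the weighting-function picture of $G$ in action: each charge element contributes in proportion to $1/|\mathbf{r}_1 - \mathbf{r}_2|$.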

In Sections 16.5 and 16.6 these results will be generalized to the second-order 
linear, but nonhomogeneous, differential equation 

$$\mathcal{L} y(\mathbf{r}_1) = -f(\mathbf{r}_1). \tag{8.131}$$

The Green’s function is taken to be a solution of 

$$\mathcal{L} G(\mathbf{r}_1, \mathbf{r}_2) = -\delta(\mathbf{r}_1 - \mathbf{r}_2) \tag{8.132}$$

(analogous to Eq. 8.122). Then the particular solution $y(\mathbf{r}_1)$ becomes 

$$y(\mathbf{r}_1) = \int G(\mathbf{r}_1, \mathbf{r}_2)\, f(\mathbf{r}_2)\, d\tau_2. \tag{8.133}$$

(There may also be an integral over a bounding surface depending on the conditions 
specified.) 

In summary, Green’s function, often written $G(\mathbf{r}_1, \mathbf{r}_2)$ as a reminder of the 
name, is a solution of Eq. 8.122. It enters in an integral solution of our differential 
equation, as in Eq. 8.104. For the simple, but important, electrostatic case we 
obtain Green’s function $G(\mathbf{r}_1, \mathbf{r}_2)$ by Gauss’s law, comparing Eqs. 8.122 and 8.128. 
Finally, from the final solution (Eq. 8.130) it is possible to develop a physical 
interpretation of Green’s function. It occurs as a weighting function or influence 
function that enhances or reduces the effect of the charge element $\rho(\mathbf{r}_2)\, d\tau_2$ 
according to its distance from the field point $\mathbf{r}_1$. Green’s function, $G(\mathbf{r}_1, \mathbf{r}_2)$, gives 
the effect of a unit point source at $\mathbf{r}_2$ in producing a potential at $\mathbf{r}_1$. This is how 
it was introduced in Eq. 8.122; this is how it appears in Eq. 8.130. 


Symmetry of Green's Function 

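An important property of Green’s function is the symmetry of its two variables, 
that is, 

$$G(\mathbf{r}_1, \mathbf{r}_2) = G(\mathbf{r}_2, \mathbf{r}_1). \tag{8.134}$$

Although this is obvious in the electrostatic case just considered, it can be 
proved under much more general conditions. In place of Eq. 8.122, let us 
require that $G(\mathbf{r}, \mathbf{r}_1)$ satisfy$^4$ 

$$\nabla \cdot [p(\mathbf{r}) \nabla G(\mathbf{r}, \mathbf{r}_1)] + \lambda q(\mathbf{r}) G(\mathbf{r}, \mathbf{r}_1) = -\delta(\mathbf{r} - \mathbf{r}_1), \tag{8.135}$$

corresponding to a mathematical point source at $\mathbf{r} = \mathbf{r}_1$. Here the functions 
$p(\mathbf{r})$ and $q(\mathbf{r})$ are well behaved but otherwise arbitrary functions of $\mathbf{r}$. Green’s 
function, $G(\mathbf{r}, \mathbf{r}_2)$, satisfies the same equation but the subscript 1 is replaced by 
subscript 2: 

$$\nabla \cdot [p(\mathbf{r}) \nabla G(\mathbf{r}, \mathbf{r}_2)] + \lambda q(\mathbf{r}) G(\mathbf{r}, \mathbf{r}_2) = -\delta(\mathbf{r} - \mathbf{r}_2). \tag{8.136}$$

Then $G(\mathbf{r}, \mathbf{r}_2)$ is a sort of potential at $\mathbf{r}$, created by a unit point source at $\mathbf{r}_2$. We 
multiply the equation for $G(\mathbf{r}, \mathbf{r}_1)$ by $G(\mathbf{r}, \mathbf{r}_2)$ and the equation for $G(\mathbf{r}, \mathbf{r}_2)$ by 
$G(\mathbf{r}, \mathbf{r}_1)$ and then subtract the two: 

$$G(\mathbf{r}, \mathbf{r}_2) \nabla \cdot [p(\mathbf{r}) \nabla G(\mathbf{r}, \mathbf{r}_1)] - G(\mathbf{r}, \mathbf{r}_1) \nabla \cdot [p(\mathbf{r}) \nabla G(\mathbf{r}, \mathbf{r}_2)] = -G(\mathbf{r}, \mathbf{r}_2)\, \delta(\mathbf{r} - \mathbf{r}_1) + G(\mathbf{r}, \mathbf{r}_1)\, \delta(\mathbf{r} - \mathbf{r}_2). \tag{8.137}$$

The first term in Eq. 8.137, 

$$G(\mathbf{r}, \mathbf{r}_2) \nabla \cdot [p(\mathbf{r}) \nabla G(\mathbf{r}, \mathbf{r}_1)],$$

may be replaced by 

$$\nabla \cdot [G(\mathbf{r}, \mathbf{r}_2)\, p(\mathbf{r}) \nabla G(\mathbf{r}, \mathbf{r}_1)] - \nabla G(\mathbf{r}, \mathbf{r}_2) \cdot p(\mathbf{r}) \nabla G(\mathbf{r}, \mathbf{r}_1).$$

A similar transformation is carried out on the second term. Then integrating 
over whatever volume is involved and using Green’s theorem, we obtain a 
surface integral: 

$$\int_S [G(\mathbf{r}, \mathbf{r}_2)\, p(\mathbf{r}) \nabla G(\mathbf{r}, \mathbf{r}_1) - G(\mathbf{r}, \mathbf{r}_1)\, p(\mathbf{r}) \nabla G(\mathbf{r}, \mathbf{r}_2)] \cdot d\boldsymbol{\sigma} = -G(\mathbf{r}_1, \mathbf{r}_2) + G(\mathbf{r}_2, \mathbf{r}_1). \tag{8.138}$$

$^4$ Equation 8.135 is a three-dimensional version of the self-adjoint eigenvalue 
equation, Eq. 9.4. 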

The terms on the right-hand side appear when we use the Dirac delta functions 
and carry out the volume integration. Under the requirement that Green’s 
functions, $G(\mathbf{r}, \mathbf{r}_1)$ and $G(\mathbf{r}, \mathbf{r}_2)$, have the same values over the surface $S$ and that 
their normal derivatives have the same values over the surface $S$, or that the 
Green’s functions vanish (Dirichlet boundary conditions, Section 9.1)$^5$ over the 
surface $S$, the surface integral vanishes and 

$$G(\mathbf{r}_1, \mathbf{r}_2) = G(\mathbf{r}_2, \mathbf{r}_1), \tag{8.139}$$

which shows that Green’s function is symmetric. If the eigenfunctions are complex, 
boundary conditions corresponding to Eqs. 9.20 to 9.22 are appropriate. 
Equation 8.139 becomes 

$$G(\mathbf{r}_1, \mathbf{r}_2) = G^*(\mathbf{r}_2, \mathbf{r}_1). \tag{8.140}$$

Note that this symmetry property holds for Green’s function in every equation 
in the form of Eq. 8.135. In Chapter 9 we shall call equations in this form 
self-adjoint. The symmetry is the basis of various reciprocity theorems; the effect of 
a charge at $\mathbf{r}_2$ on the potential at $\mathbf{r}_1$ is the same as the effect of a charge at $\mathbf{r}_1$ 
on the potential at $\mathbf{r}_2$. 

$^5$ Any attempt to demand that the normal derivatives vanish at the surface 
(Neumann’s conditions, Section 9.1) leads to trouble with Gauss’s law. It is 
like demanding that $\int \mathbf{E} \cdot d\boldsymbol{\sigma} = 0$ when you know perfectly well that there is 
some electric charge inside the surface. 
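
The symmetry can be seen directly in a one-dimensional analog. The Python sketch below is an illustration under stated assumptions, not a construction from the text: it takes Eq. 8.135 in one dimension with $p = 1$ and $\lambda q = 0$, that is, $-G'' = \delta(x - x')$ on $0 \le x \le 1$ with $G = 0$ at both ends (Dirichlet conditions), and replaces the derivative by second differences. The helper names are hypothetical. For this simple case the analytic Green's function is $x_<(1 - x_>)$, where $x_<$ and $x_>$ are the smaller and larger of $x$ and $x'$.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (no pivoting).

    lower[i] multiplies x[i-1] and upper[i] multiplies x[i+1];
    lower[0] and upper[-1] do not affect the result.
    """
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def green_column(j, n):
    """Values G(x_i, x_j), i = 0..n-1, on interior nodes x_i = (i + 1)/(n + 1).

    Solves -G'' = delta(x - x_j) with G = 0 at x = 0 and x = 1, using the
    second-difference operator and a unit source spread over one cell.
    """
    h = 1.0 / (n + 1)
    off = [-1.0 / h ** 2] * n
    diag = [2.0 / h ** 2] * n
    rhs = [0.0] * n
    rhs[j] = 1.0 / h
    return solve_tridiagonal(off, diag, off, rhs)


if __name__ == "__main__":
    n = 9                                    # nodes at x = 0.1, ..., 0.9
    G = [green_column(j, n) for j in range(n)]
    asymmetry = max(abs(G[i][j] - G[j][i]) for i in range(n) for j in range(n))
    print("largest |G(x_i, x_j) - G(x_j, x_i)|:", asymmetry)
    print("G(0.2, 0.7):", G[6][1], " analytic:", 0.2 * (1.0 - 0.7))
```

Each call places the unit source at a different node; the table of results is symmetric to rounding error, as Eq. 8.139 requires for Dirichlet boundary conditions.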

This use of Green’s functions is a powerful technique for solving many of the 
more difficult problems of mathematical physics. We return to it when we take 
up integral equations in Chapter 16. 


EXERCISES 

8.7.1 Let 

$$\delta_n(x) = \begin{cases} 0, & x < -\dfrac{1}{2n}, \\ n, & -\dfrac{1}{2n} < x < \dfrac{1}{2n}, \\ 0, & \dfrac{1}{2n} < x. \end{cases}$$

Show that 

$$\lim_{n \to \infty} \int_{-\infty}^{\infty} f(x)\, \delta_n(x)\, dx = f(0),$$

assuming that $f(x)$ is continuous at $x = 0$. 

8.7.2 Verify that the sequence $\delta_n(x)$, based on the function 

$$\delta_n(x) = \begin{cases} 0, & x < 0, \\ n e^{-nx}, & x > 0, \end{cases}$$

is a delta sequence (satisfying Eq. 8.114). Note that the singularity is at $+0$, the 
positive side of the origin. 

Hint. Replace the upper limit ($\infty$) by $c/n$, where $c$ is large but finite and use the 
mean value theorem of integral calculus. 

8.7.3 For 

$$\delta_n(x) = \frac{n}{\pi} \cdot \frac{1}{1 + n^2 x^2}$$

(Eq. 8.110), show that 

$$\int_{-\infty}^{\infty} \delta_n(x)\, dx = 1.$$

8.7.4 Demonstrate that $\delta_n = \sin nx / \pi x$ is a delta distribution by showing that 

$$\lim_{n \to \infty} \int_{-\infty}^{\infty} f(x) \frac{\sin nx}{\pi x}\, dx = f(0).$$

Assume that $f(x)$ is continuous at $x = 0$ and vanishes as $x \to \pm\infty$. 

Hint. Replace $x$ by $y/n$ and take $\lim n \to \infty$ before integrating. The needed integral 
is evaluated in Sections 7.2 and 15.7. 

8.7.5 Fejér’s method of summing series is associated with the function 

$$\delta_n(t) = \frac{1}{2\pi n} \left[ \frac{\sin(nt/2)}{\sin(t/2)} \right]^2.$$

Show that $\delta_n(t)$ is a delta distribution in the sense that 

$$\lim_{n \to \infty} \frac{1}{2\pi n} \int_{-\infty}^{\infty} f(t) \left[ \frac{\sin(nt/2)}{\sin(t/2)} \right]^2 dt = f(0).$$


8.7.6 Prove that 

$$\delta[a(x - x_1)] = \frac{1}{a}\, \delta(x - x_1).$$

Note. If $\delta[a(x - x_1)]$ is considered even relative to $x_1$, the relation holds for 
negative $a$ and $1/a$ may be replaced by $1/|a|$. 

8.7.7 Show that 

$$\delta[(x - x_1)(x - x_2)] = [\delta(x - x_1) + \delta(x - x_2)] / |x_1 - x_2|.$$

Hint. Try using Exercise 8.7.6. 

8.7.8 Using the Gauss error curve delta sequence ($\delta_n$), show that 

$$x \frac{d}{dx} \delta(x) = -\delta(x),$$

treating $\delta(x)$ and its derivative as in Eq. 8.115. 

8.7.9 Show that 

$$\int_{-\infty}^{\infty} \delta'(x) f(x)\, dx = -f'(0).$$

Here we assume that $f'(x)$ is continuous at $x = 0$. 

8.7.10 Prove that 

$$\delta(f(x)) = \left| \frac{df(x)}{dx} \right|^{-1}_{x = x_0} \delta(x - x_0),$$

where $x_0$ is chosen so that $f(x_0) = 0$. 

Hint. Note that $\delta(f)\, df = \delta(x)\, dx$. 

8.7.11 Show that in spherical polar coordinates $(r, \cos\theta, \varphi)$ the delta function $\delta(\mathbf{r}_1 - \mathbf{r}_2)$ 
becomes 

$$\frac{1}{r_1^2}\, \delta(r_1 - r_2)\, \delta(\cos\theta_1 - \cos\theta_2)\, \delta(\varphi_1 - \varphi_2).$$

Generalize this to the curvilinear coordinates $(q_1, q_2, q_3)$ of Section 2.1 with scale 
factors $h_1$, $h_2$, and $h_3$. 

8.7.12 A rigorous development of Fourier transforms (Sneddon, Fourier Transforms)$^6$ 
includes as a theorem the relations 

$$\lim_{a \to \infty} \frac{2}{\pi} \int_{x_1}^{x_2} f(u + x) \frac{\sin ax}{x}\, dx = \begin{cases} f(u + 0) + f(u - 0), & x_1 < 0 < x_2, \\ f(u + 0), & x_1 = 0 < x_2, \\ f(u - 0), & x_1 < 0 = x_2, \\ 0, & x_1 < x_2 < 0 \ \text{or} \ 0 < x_1 < x_2. \end{cases}$$

Verify these results using the Dirac delta function. 

$^6$ Sneddon, I. N., Fourier Transforms. New York: McGraw-Hill (1951). 

FIG. 8.9 $\frac{1}{2}[1 + \tanh nx]$ and the Heaviside unit step function 

8.7.13 (a) If we define a sequence $\delta_n(x) = n/(2\cosh^2 nx)$, show that 

$$\int_{-\infty}^{\infty} \delta_n(x)\, dx = 1, \quad \text{independent of } n.$$

(b) Continuing this analysis, show that* 

$$\int_{-\infty}^{x} \delta_n(x)\, dx = \tfrac{1}{2}[1 + \tanh nx] \equiv u_n(x)$$

and 

$$\lim_{n \to \infty} u_n(x) = \begin{cases} 0, & x < 0, \\ 1, & x > 0. \end{cases}$$

This is the Heaviside unit step function. 

* Many other symbols are used for this function. This is the AMS-55 notation: 
$u$ for unit. 

8.7.14 Show that the unit step function $u(x)$ may be represented by 

$$u(x) = \frac{1}{2} + \frac{1}{2\pi i} P \int_{-\infty}^{\infty} e^{ixt}\, \frac{dt}{t},$$

where $P$ means Cauchy principal value (Section 7.2). 

8.7.15 As a variation of Eq. 8.111, take 

$$\delta_n(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{ixt - |t|/n}\, dt.$$

Show that this reduces to $(n/\pi) \cdot 1/(1 + n^2 x^2)$, Eq. 8.110, and that 

$$\int_{-\infty}^{\infty} \delta_n(x)\, dx = 1.$$

Note. In terms of integral transforms, the initial equation here may be interpreted 
as either a Fourier exponential transform of $e^{-|t|/n}$ or a Laplace transform of $e^{ixt}$. 

8.7.16 Show that 

$$G(\mathbf{r}_1, \mathbf{r}_2) = \frac{\exp(ik|\mathbf{r}_1 - \mathbf{r}_2|)}{4\pi |\mathbf{r}_1 - \mathbf{r}_2|}$$

is a Green’s function satisfying the differential equation 

$$(\nabla_1^2 + k^2)\, G(\mathbf{r}_1, \mathbf{r}_2) = -\delta(\mathbf{r}_1 - \mathbf{r}_2).$$

This involves two parts: 

(a) Show that $G(\mathbf{r}_1, \mathbf{r}_2)$ satisfies the homogeneous differential equation away 
from $\mathbf{r}_1 = \mathbf{r}_2$. 

(b) Show that 

{ 0 r & V 

_l’ f 2 eW 


8.8 NUMERICAL SOLUTIONS 

The analytic solutions and approximate solutions to differential equations in 
this chapter and in succeeding chapters may suffice to solve the problem at hand 
— particularly if there is some symmetry present. The power-series solutions 
show how the solution behaves at small values of x. The asymptotic solutions 
(compare Sections 11.6 and 12.10) show how the solution behaves at large values 
of x. These limiting cases and also the possible resemblance of our differential 
equation to the standard forms with known solutions (Chapters 11 to 13) are 
invaluable in helping us gain an understanding of the general behavior of our 
solution. 

However, the usual situation is that we have a different equation, perhaps a 
different potential in the Schrodinger wave equation, and we want a reasonably 
exact solution. So we turn to numerical techniques. 

First-Order Differential Equations 

The differential equation involves a continuity of points. The independent 
variable $x$ is continuous. The (unknown) dependent variable $y(x)$ is assumed 
continuous. The concept of differentiation demands continuity. Our numerical 
processes replace these continua by discrete sets. We consider $x$ at 

$$x_0, \quad x_0 + h, \quad x_0 + 2h, \quad x_0 + 3h, \quad \text{and so on},$$

where $h$ is some small interval. The smaller $h$ is, the better the approximation 
is, in principle. But if $h$ is made too small, the demands on machine time will be 
excessive, and accuracy may actually decline because of accumulated round-off 
errors. We refer to the successive discrete values of $x$ as $x_n$, $x_{n+1}$, and so on, and 
the corresponding values of $y(x)$ as $y(x_n) = y_n$. If $x_0$ and $y_0$ are given, the problem 
is to find $y_1$, then to find $y_2$, and so on. 


Taylor Series Solution 

Consider the ordinary (possibly nonlinear) first-order differential equation 

$$\frac{d}{dx} y(x) = f(x, y) \tag{8.141}$$

with the initial condition $y(x_0) = y_0$. In principle, a step-by-step solution of the 
first-order equation, Eq. 8.141, may be developed to any degree of accuracy by 
a Taylor expansion 

$$y(x_0 + h) = y(x_0) + h y'(x_0) + \frac{h^2}{2!} y''(x_0) + \cdots + \frac{h^n}{n!} y^{(n)}(x_0) + \cdots \tag{8.142}$$

(assuming the derivatives exist and the series is convergent). The initial value 
$y(x_0)$ is known and $y'(x_0)$ is given as $f(x_0, y_0)$. In principle, the higher derivatives 
may be obtained by differentiating $y'(x) = f(x, y)$. In practice, this differentiation 
may be tedious. Now, however, this differentiation can be done by computer, 
using languages such as FORMAC. For equations of the form encountered in 
this chapter a large computer has no trouble generating and evaluating ten or 
more derivatives. 

The Taylor series solution is a form of analytic continuation, Section 6.5. 

If the right-hand side of Eq. 8.142 is truncated after two terms, we have 

$$y_1 = y_0 + h y_0' = y_0 + h f(x_0, y_0), \tag{8.143}$$

neglecting the terms of order $h^2$. Eq. 8.143 is often called the Euler solution. 
Clearly, it is subject to serious error with the neglect of terms of order $h^2$. 

Runge-Kutta Method 

The Runge-Kutta method is a refinement of this, with an error of order $h^5$. 
The relevant formulas are 

$$y_{n+1} = y_n + [k_0 + 2k_1 + 2k_2 + k_3]/6, \tag{8.144}$$

where 

$$\begin{aligned} k_0 &= h f(x_n, y_n), \\ k_1 &= h f(x_n + \tfrac{1}{2}h,\; y_n + \tfrac{1}{2}k_0), \\ k_2 &= h f(x_n + \tfrac{1}{2}h,\; y_n + \tfrac{1}{2}k_1), \\ k_3 &= h f(x_n + h,\; y_n + k_2). \end{aligned} \tag{8.145}$$

A derivation of these equations appears in Ralston and Wilf$^1$ (Chapter 9 by 
M. J. Romanelli). 

Equations 8.144 and 8.145 define what might be called the classic fourth-order 
Runge-Kutta method (accurate through terms of order $h^4$). This is the form 
followed in IBM’s Scientific Subroutine Package (SSP). Many other Runge-Kutta 
methods exist. Lapidus and Seinfeld (see references) analyze and compare 
other possibilities and recommend a fifth-order form due to Butcher as slightly 
superior to the classic method. 

The form of Eqs. 8.144 and 8.145 is assumed and the parameters adjusted 
to fit a Taylor expansion through $h^4$. From this Taylor expansion viewpoint 
the Runge-Kutta method is also an example of analytic continuation. 

For the special case in which $dy/dx$ is a function of $x$ alone [$f(x, y)$ in Eq. 
8.141 $\to f(x)$], the last term in Eq. 8.144 reduces to a Simpson rule numerical 
integration from $x_n$ to $x_{n+1}$. 

$^1$ A. Ralston and H. S. Wilf, eds., Mathematical Methods for Digital Computers. 
New York: Wiley (1960). 

The Runge-Kutta method is stable, meaning that small errors do not get 
amplified. It is self-starting, meaning that we just take the $x_0$ and $y_0$ and away 
we go. But it has disadvantages. Four separate calculations of $f(x, y)$ are required 
at each step. The errors, although of order $h^5$ per step, are not known. One 
checks the numerical solution by cutting $h$ in half and repeating the calculation. 
If the second result agrees with the first, then $h$ was small enough. 
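
The formulas and the step-halving check translate almost line for line into code. The Python sketch below is a minimal illustration, not the SSP routine mentioned above; the function names are assumptions, and the test problem is the falling body of Exercise 8.8.2, $dv/dt = g - av$ with $g = 9.80$ and $a = 0.2$, for which the exercise quotes the check value $v(10) = 42.369$.

```python
import math


def rk4_step(f, x, y, h):
    """One classic fourth-order Runge-Kutta step, Eqs. 8.144 and 8.145."""
    k0 = h * f(x, y)
    k1 = h * f(x + 0.5 * h, y + 0.5 * k0)
    k2 = h * f(x + 0.5 * h, y + 0.5 * k1)
    k3 = h * f(x + h, y + k2)
    return y + (k0 + 2.0 * k1 + 2.0 * k2 + k3) / 6.0


def rk4(f, x0, y0, h, n_steps):
    """Integrate dy/dx = f(x, y) from x0 over n_steps steps of size h."""
    y = y0
    for i in range(n_steps):
        y = rk4_step(f, x0 + i * h, y, h)
    return y


G_ACCEL = 9.80   # meters/sec^2, value quoted in Exercise 8.8.2
DRAG = 0.2       # 1/sec, value quoted in Exercise 8.8.2


def falling_body(t, v):
    """Right-hand side of dv/dt = g - a v."""
    return G_ACCEL - DRAG * v


if __name__ == "__main__":
    coarse = rk4(falling_body, 0.0, 0.0, 0.1, 100)    # t = 10 with h = 0.1
    fine = rk4(falling_body, 0.0, 0.0, 0.05, 200)     # same interval, h halved
    exact = G_ACCEL / DRAG * (1.0 - math.exp(-DRAG * 10.0))
    print(coarse, fine, exact)
```

Here the analytic solution is available, so the comparison of the two runs can itself be checked; in a real problem only the agreement between the $h$ and $h/2$ results is at hand.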

Finally, the Runge-Kutta method can be extended to a set of coupled 
first-order equations: 

$$\frac{du}{dx} = f_1(x, u, v), \qquad \frac{dv}{dx} = f_2(x, u, v), \quad \text{and so on}, \tag{8.146}$$

with as many dependent variables as desired. Again, Eq. 8.146 may be nonlinear, 
an advantage of the numerical solution. 


Predictor-Corrector Methods 

As an alternate attack on Eq. 8.141, we might estimate or predict a tentative 
value of $y_{n+1}$ by 

$$y_{n+1} = y_{n-1} + 2h y_n' = y_{n-1} + 2h f(x_n, y_n). \tag{8.147}$$

This is not quite the same as Eq. 8.143. Rather, it may be interpreted as 

$$y_n' \approx \frac{\Delta y}{\Delta x} = \frac{y_{n+1} - y_{n-1}}{2h}, \tag{8.148}$$

the derivative as a tangent being replaced by a chord. Next we calculate 

$$y_{n+1}' = f(x_{n+1}, y_{n+1}). \tag{8.149}$$

Then to correct for the crudeness of Eq. 8.147, we take 

$$y_{n+1} = y_n + \frac{h}{2} (y_{n+1}' + y_n'). \tag{8.150}$$

Here the finite difference ratio $\Delta y / h$ is approximated by the average of the two 
derivatives. This technique, a prediction followed by a correction (and iteration 
until agreement is reached), is the heart of the predictor-corrector method. 
It should be emphasized that the preceding set of equations is intended only to 
illustrate the predictor-corrector method. The accuracy of this set (to order $h^3$) 
is usually inadequate. 

The iteration (substituting $y_{n+1}$ from Eq. 8.150 back into Eq. 8.149 and 
recycling until $y_{n+1}$ settles down to some limit) is time-consuming in a computing 
machine operation. Consequently, the iteration is usually replaced by an 
intermediate step (the modifier) between Eqs. 8.147 and 8.149. 

This modified predictor-corrector method has the major advantage over 
the Runge-Kutta method of requiring only two computations of $f(x, y)$ per 
step, instead of four. Unfortunately, the method as originally developed was 
unstable: small errors (round-off and truncation) tended to propagate and 
become amplified. 

This very serious problem of instability has been overcome in a version of 
the predictor-corrector method devised by Hamming. The formulas (which 
are moderately involved), a partial derivation, and detailed instructions for 
starting the solution are all given by Ralston (Chapter 8 of Ralston and Wilf). 
Hamming’s method is accurate to order $h^4$. It is stable for all reasonable values 
of $h$ and provides an estimate of the error. Unlike the Runge-Kutta method, 
it is not self-starting. For example, Eq. 8.147 requires both $y_{n-1}$ and $y_n$. Starting 
values $(y_0, y_1, y_2, y_3)$ for the Hamming predictor-corrector method may be 
computed by series solution (power series for small $x$, asymptotic series for 
large $x$) or by the Runge-Kutta method. 

The Hamming predictor-corrector method may be extended to cover a set 
of coupled first-order differential equations, that is, Eq. 8.146. 
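
The illustrative scheme of Eqs. 8.147, 8.149, and 8.150 (not Hamming's method, whose formulas are not reproduced in this section) can be sketched in a few lines of Python. Everything specific here is an assumption made for the illustration: the function name, the fixed number of corrector sweeps in place of iteration to convergence, the Euler step used to start (since no $y_{n-1}$ exists at the first step), and the test equation $y' = -y$.

```python
import math


def predictor_corrector(f, x0, y0, h, n_steps, sweeps=2):
    """Illustrative predictor-corrector built from Eqs. 8.147, 8.149, 8.150.

    The very first step has no y_{n-1}, so an Euler step (Eq. 8.143) serves
    as the predictor there; that choice is an assumption of this sketch.
    """
    ys = [y0]
    for n in range(n_steps):
        x_n, y_n = x0 + n * h, ys[-1]
        dy_n = f(x_n, y_n)
        if n == 0:
            y_next = y_n + h * dy_n               # Euler start
        else:
            y_next = ys[-2] + 2.0 * h * dy_n      # predictor, Eq. 8.147
        x_next = x0 + (n + 1) * h
        for _ in range(sweeps):
            dy_next = f(x_next, y_next)           # derivative, Eq. 8.149
            y_next = y_n + 0.5 * h * (dy_next + dy_n)   # corrector, Eq. 8.150
        ys.append(y_next)
    return ys


if __name__ == "__main__":
    ys = predictor_corrector(lambda x, y: -y, 0.0, 1.0, 0.01, 100)
    print(ys[-1], math.exp(-1.0))   # numerical y(1) beside the exact value
```

Each corrector sweep costs one more evaluation of $f(x, y)$, which is why the text describes the iteration as time-consuming and why practical schemes replace it with a single modifier step.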

Second-Order Differential Equations 

Any second-order differential equation 

$$y''(x) + P(x) y'(x) + Q(x) y(x) = F(x) \tag{8.151}$$

may be split into two first-order differential equations by writing 

$$y'(x) = z(x) \tag{8.152}$$

and then 

$$z'(x) + P(x) z(x) + Q(x) y(x) = F(x). \tag{8.153}$$

These coupled first-order differential equations may be solved by either the 
Runge-Kutta or Hamming predictor-corrector techniques previously 
described. 
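
The reduction to a first-order system is easy to mechanize. The Python sketch below is a hedged illustration with hypothetical helper names: it builds the pair of Eqs. 8.152 and 8.153 from user-supplied $P(x)$, $Q(x)$, and $F(x)$, and advances the state $[y, z]$ with the classic Runge-Kutta formulas applied component by component. As a test it uses Hermite's equation from Exercise 8.8.7, $y'' - 2xy' = 0$ with $y(0) = 0$, $y'(0) = 1$, for which the exercise quotes $y(1) = 1.463$.

```python
def rk4_system_step(f, x, state, h):
    """One classic Runge-Kutta step for a list of coupled first-order equations.

    f(x, state) must return the list of derivatives, as in Eq. 8.146.
    """
    def shifted(base, k, factor):
        return [b + factor * ki for b, ki in zip(base, k)]

    k0 = [h * d for d in f(x, state)]
    k1 = [h * d for d in f(x + 0.5 * h, shifted(state, k0, 0.5))]
    k2 = [h * d for d in f(x + 0.5 * h, shifted(state, k1, 0.5))]
    k3 = [h * d for d in f(x + h, shifted(state, k2, 1.0))]
    return [s + (a + 2.0 * b + 2.0 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k0, k1, k2, k3)]


def second_order_as_system(P, Q, F):
    """Right-hand side for y'' + P y' + Q y = F with state = [y, z], z = y'."""
    def rhs(x, state):
        y, z = state
        return [z, F(x) - P(x) * z - Q(x) * y]   # Eqs. 8.152 and 8.153
    return rhs


def integrate(rhs, x0, state0, h, n_steps):
    """Apply n_steps Runge-Kutta steps of size h starting from state0 at x0."""
    state = list(state0)
    for i in range(n_steps):
        state = rk4_system_step(rhs, x0 + i * h, state, h)
    return state


if __name__ == "__main__":
    # Hermite's equation of Exercise 8.8.7: y'' - 2x y' = 0, y(0) = 0, y'(0) = 1.
    hermite = second_order_as_system(lambda x: -2.0 * x,
                                     lambda x: 0.0,
                                     lambda x: 0.0)
    y_at_1, z_at_1 = integrate(hermite, 0.0, [0.0, 1.0], 0.01, 100)
    print(y_at_1)   # to be compared with the quoted answer y(1) = 1.463
```

The same `integrate` routine handles any number of coupled equations, since the step function never assumes that the state has exactly two components.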

As a final note, a thoughtless “turning the crank” application of these powerful 
numerical techniques is an invitation to disaster. The solution of a new and 
different differential equation will usually involve a mixture of analysis and 
numerical calculation. There is little point in trying to force a Runge-Kutta 
solution through a singular point where the solution is going to blow up. 

EXERCISES 

8.8.1 The Runge-Kutta method, Eq. 8.144, is applied to a first-order differential equation 
$dy/dx = f(x)$. Note that this function $f(x)$ is independent of $y$. Show that in this 
special case the Runge-Kutta method reduces to Simpson’s rule for numerical 
quadrature, Appendix A2. 

8.8.2 (a) A body falling through a resisting medium is described by 

$$\frac{dv}{dt} = g - av$$

(for a retarding force proportional to the velocity). Take the constants to 
be $g = 9.80$ (meters/sec$^2$) and $a = 0.2$ (sec$^{-1}$). The initial conditions are $t = 0$, 
$v = 0$. Integrate this equation out to $t = 20.0$ in steps of 0.1 sec. Tabulate the 
value of the velocity for each whole second, $v(1.0)$, $v(2.0)$, and so on. If a 
plotting routine is available, plot $v(t)$ versus $t$. 

(b) Calculate the ratio of $v(20.0)$ to the terminal velocity $v(\infty)$. 

Check value. $v(10) = 42.369$ meters/sec. 

ANS. (b) 0.9817. 

8.8.3 The differential equation for the population of a radioactive daughter element is 

$$\frac{dN_2}{dt} = \lambda_1 \exp(-\lambda_1 t) - \lambda_2 N_2,$$

$\lambda_1 \exp(-\lambda_1 t)$ being the rate of production resulting from the decay of the parent 
element. $\lambda_1 = 0.10$ sec$^{-1}$, $\lambda_2 = 0.08$ sec$^{-1}$. Integrate this differential equation from 
$t = 0$ out to $t = 40$ seconds for the initial condition $N_2(0) = 0$. Tabulate and plot 
$N_2(t)$ vs $t$. 


8.8.4 The time-reversed asteroid depletion equation is 



Solve this equation by using a Runge-Kutta or equivalent subroutine. The initial 
conditions are 

$t_0 = 0$ (years) 

$N_0 = 100$ (asteroids) 

$k = 0.25 \times 10^{-11}$ (years)$^{-1}$ (asteroid)$^{-1}$. 

Carry out your solution as far as you can. (There will be trouble as you approach 
$t = 5 \times 10^9$ years.) Tabulate $N(t)$ versus $t$, with $\Delta t = 5 \times 10^7$ years. 

Note. Exercise 8.2.3 (with $k$ replaced by $-k$) gives the analytic solution. 


8.8.5 Integrate Legendre’s differential equation, Exercise 8.5.5, from $x = 0$ to $x = 1$ 
with the initial conditions $y(0) = 1$, $y'(0) = 0$ (even solution). Tabulate $y(x)$ and 
$dy/dx$ at intervals of 0.05. Take $n = 2$. 

8.8.6 The Lane-Emden equation of astrophysics is 

$$\frac{d^2 y}{dx^2} + \frac{2}{x} \frac{dy}{dx} + y^s = 0.$$

Take $y(0) = 1$, $y'(0) = 0$, and investigate the behavior of $y(x)$ for $s = 0, 1, 2, 3, 4, 5$, 
and 6. In particular, locate the first zero of $y(x)$. 

Hint. From a power-series solution $y''(0) = -\frac{1}{3}$. 

Note. For $s = 0$, $y(x)$ is a parabola, for $s = 1$, a spherical Bessel function, $j_0(x)$. 
As $s \to 5$, the first zero moves out to $\infty$, and for $s > 5$, $y(x)$ never crosses the positive 
$x$-axis. 

ANS. For $y(x_s) = 0$, $x_0 = 2.45\ (\sqrt{6})$, 
$x_1 = 3.14\ (\pi)$, $x_2 = 4.35$, 
$x_3 = 6.90$. 

8.8.7 As a check on Exercise 8.6.18(a), integrate Hermite’s equation 

$$\frac{d^2 y}{dx^2} - 2x \frac{dy}{dx} = 0$$

from $x = 0$ out to $x = 3$. The initial conditions are $y(0) = 0$, $y'(0) = 1$. Tabulate 
$y(1)$, $y(2)$, and $y(3)$. 

ANS. $y(1) = 1.463$ 
$y(2) = 16.45$ 
$y(3) = 1445$. 


REFERENCES 

Bateman, H., Partial Differential Equations of Mathematical Physics. New York: Dover 
(1944; first edition, 1932). 

A wealth of applications of various partial differential equations in classical physics. 
Excellent examples of the use of different coordinate systems — ellipsoidal, parabo- 
loidal, toroidal coordinates, and so on. 

Davis, P. J., and P. Rabinowitz, Numerical Integration. Waltham, Mass.: Blaisdell (1967). 
This book covers a great deal of material in a relatively easy-to-read form. Appendix 1 
(On the Practical Evaluation of Integrals by M. Abramowitz) is excellent as an overall 
view. 

Hamming, R. W., Numerical Methods for Scientists and Engineers , 2nd ed. New York: 
McGraw-Hill (1973). 

This well-written text discusses a wide variety of numerical methods from zeros of 
functions to the fast Fourier transform. All topics are selected and developed with a 
modern high-speed computer in mind. 

Ince, E. L., Ordinary Differential Equations. New York: Dover (1926). 

The classic work in the theory of ordinary differential equations. 

Lapidus, L., and J. H. Seinfeld, Numerical Solutions of Ordinary Differential Equations. 
New York: Academic Press (1971). 

A detailed and comprehensive discussion of numerical techniques with emphasis on the 
Runge-Kutta and predictor-corrector methods. Recent work on the improvement of 
characteristics such as stability is clearly presented. 

Miller, R. K., and A. N. Michel, Ordinary Differential Equations. New York: Academic 
Press (1982). 

Murphy, G. M., Ordinary Differential Equations and Their Solutions. Princeton, N.J.: 
Van Nostrand (1960). 

A thorough, relatively readable treatment of ordinary differential equations, both 
linear and nonlinear. 

Ralston, A., and H. Wilf, Eds., Mathematical Methods for Digital Computers. New York:
Wiley (1960).

Ritger, P. D., and N. J. Rose, Differential Equations with Applications. New York: 
McGraw-Hill (1968). 

Stroud, A. H., Numerical Quadrature and Solution of Ordinary Differential Equations , 
Applied Mathematics Series, Vol. 10. New York: Springer-Verlag (1974). 

A balanced, readable, and very helpful discussion of various methods of integrating 
differential equations. Stroud is familiar with recent work in this field and provides 
numerous current references. 



9 STURM-LIOUVILLE THEORY - ORTHOGONAL FUNCTIONS


In the preceding chapter we developed two linearly independent solutions
of the second-order linear homogeneous differential equation and proved that
no third, linearly independent solution existed. In this chapter the emphasis
shifts from solving the differential equation to developing and understanding
general properties of the solutions. In Section 9.1 the concepts of self-adjoint
operator, eigenfunction, eigenvalue, and Hermitian operator are presented. The
concept of adjoint operator, given first in terms of differential equations, is then
redefined in accordance with usage in quantum mechanics. The vital properties
of reality of eigenvalues and orthogonality of eigenfunctions are derived in
Section 9.2. In Section 9.3 we discuss the Gram-Schmidt procedure for
systematically constructing sets of orthogonal functions. Finally, the general property
of the completeness of a set of eigenfunctions is explored in Section 9.4.


9.1 SELF-ADJOINT DIFFERENTIAL EQUATIONS

In Chapter 8 we studied, classified, and solved linear, second-order differential
equations corresponding to linear, second-order differential operators
of the general form

$$\mathcal{L}u(x) = p_0(x)\frac{d^2}{dx^2}u(x) + p_1(x)\frac{d}{dx}u(x) + p_2(x)u(x). \tag{9.1}$$

The functions $p_0(x)$, $p_1(x)$, and $p_2(x)$ are not to be confused with the constants
$p_i$ of Section 8.6. Reference to Eq. 8.73 shows that $P(x) = p_1(x)/p_0(x)$ and
$Q(x) = p_2(x)/p_0(x)$.

These coefficients, $p_0(x)$, $p_1(x)$, and $p_2(x)$, are real functions of $x$, and over the
region of interest, $a \le x \le b$, the first $2 - i$ derivatives of $p_i(x)$ are continuous.
Further, $p_0(x)$ does not vanish for $a < x < b$. Now, the zeros of $p_0(x)$ are singular
points (Section 8.4), and the preceding statement simply means that we choose
our interval $[a, b]$ so that there are no singular points in the interior of the
interval. There may be and often are singular points on the boundaries.

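It is convenient in the mathematical theory of differential equations to define
an adjoint$^1$ operator $\bar{\mathcal{L}}$ by

$$\bar{\mathcal{L}}u = \frac{d^2}{dx^2}[p_0 u] - \frac{d}{dx}[p_1 u] + p_2 u
= p_0\frac{d^2u}{dx^2} + (2p_0' - p_1)\frac{du}{dx} + (p_0'' - p_1' + p_2)u. \tag{9.2}$$

In a comparison of Eqs. 9.1 and 9.2 the necessary and sufficient condition that
$\bar{\mathcal{L}} = \mathcal{L}$ is that

$$p_0'(x) = \frac{dp_0(x)}{dx} = p_1(x). \tag{9.3}$$

When this condition is satisfied,

$$\bar{\mathcal{L}}u = \mathcal{L}u = \frac{d}{dx}\left[p(x)\frac{du(x)}{dx}\right] + q(x)u(x), \tag{9.4}$$

and the operator $\mathcal{L}$ is said to be self-adjoint. Here, for the self-adjoint case,
$p_0(x)$ is replaced by $p(x)$ and $p_2(x)$ by $q(x)$ to avoid unnecessary subscripts.
The importance of the form of Eq. 9.4 is that we will be able to carry out two
integrations by parts (Eq. 9.21 and following).$^2$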

$^1$ The adjoint operator bears a somewhat forced relationship to the adjoint
matrix. A better justification for the nomenclature is found in a comparison
of the self-adjoint operator (plus appropriate boundary conditions) with the
self-adjoint matrix. The significant properties are developed in Section 9.2.
Because of these properties, we are interested in self-adjoint operators.

$^2$ The full importance of the self-adjoint form (plus boundary conditions) will
become apparent in Section 9.2. In addition, self-adjoint forms will be required
for developing integral equations and Green's functions in Section 16.5.

In a survey of the differential equations introduced in Section 8.3, Legendre's
equation and the linear oscillator equation are self-adjoint, but others, such as
the Laguerre and Hermite equations, are not. However, the theory of linear,
second-order, self-adjoint differential equations is perfectly general because we
can always transform the non-self-adjoint operator into the required self-adjoint
form. Consider Eq. 9.1 with $p_0' \ne p_1$. If we multiply $\mathcal{L}$ by$^3$

$$\frac{1}{p_0(x)}\exp\left[\int^x \frac{p_1(t)}{p_0(t)}\,dt\right],$$

we obtain

$$\frac{1}{p_0(x)}\exp\left[\int^x \frac{p_1(t)}{p_0(t)}\,dt\right]\mathcal{L}u(x)
= \frac{d}{dx}\left\{\exp\left[\int^x \frac{p_1(t)}{p_0(t)}\,dt\right]\frac{du(x)}{dx}\right\}
+ \frac{p_2(x)}{p_0(x)}\exp\left[\int^x \frac{p_1(t)}{p_0(t)}\,dt\right]u, \tag{9.5}$$

which is clearly self-adjoint. Notice the $p_0(x)$ in the denominator. This is why
we require $p_0(x) \ne 0$, $a < x < b$. In the following development we assume that
$\mathcal{L}$ has been put into self-adjoint form.

$^3$ If we multiply $\mathcal{L}$ by $f(x)/p_0(x)$ and then demand that

$$f'(x) = \frac{f p_1}{p_0},$$

so that the new operator will be self-adjoint, we obtain

$$f(x) = \exp\left[\int^x \frac{p_1(t)}{p_0(t)}\,dt\right].$$
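
As a numerical sanity check on the multiplying factor of Eq. 9.5, the following sketch evaluates $(1/p_0)\exp[\int^x (p_1/p_0)\,dt]$ by Simpson's rule for the Laguerre and Hermite equations. It assumes the standard forms $xy'' + (1 - x)y' + \alpha y = 0$ and $y'' - 2xy' + 2\alpha y = 0$; the helper names `simpson` and `self_adjoint_factor` and the lower limits of integration are choices made only for this illustration. Because the lower limit is arbitrary, the factor is fixed only up to a constant, so the sketch looks at ratios to $e^{-x}$ and $e^{-x^2}$.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals; also valid for b < a."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def self_adjoint_factor(p0, p1, x, x_ref):
    """(1/p0(x)) * exp( integral from x_ref to x of p1(t)/p0(t) dt )."""
    integral = simpson(lambda t: p1(t) / p0(t), x_ref, x)
    return math.exp(integral) / p0(x)


xs = (0.5, 1.0, 2.0, 3.0)

# Laguerre: p0 = x, p1 = 1 - x; the factor should be proportional to exp(-x).
laguerre_ratio = [
    self_adjoint_factor(lambda t: t, lambda t: 1.0 - t, x, 1.0) / math.exp(-x)
    for x in xs
]

# Hermite: p0 = 1, p1 = -2x; the factor should be proportional to exp(-x^2).
hermite_ratio = [
    self_adjoint_factor(lambda t: 1.0, lambda t: -2.0 * t, x, 0.0) / math.exp(-x * x)
    for x in xs
]
```

Both lists should be constant in $x$ (the constant depends only on the chosen lower limit), which is the statement that $e^{-x}$ and $e^{-x^2}$ are the appropriate multiplying factors.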


Eigenfunctions, Eigenvalues 

From separation of variables or directly from a physical problem we have a
linear second-order differential equation of the form

$$\mathcal{L}u(x) + \lambda w(x)u(x) = 0. \tag{9.6}$$

Here $\lambda$ is a constant and $w(x)$ is a known function of $x$, called a density or
weighting function. The significance of these labels will appear in subsequent
sections. We require that $w(x) > 0$, except possibly at isolated points at which
$w(x) = 0$. For a given choice of the parameter $\lambda$, a function $u_\lambda(x)$, which satisfies
Eq. 9.6 and the imposed boundary conditions, is called an eigenfunction
corresponding to $\lambda$. The constant $\lambda$ is then called an eigenvalue. There is no guarantee
that an eigenfunction $u_\lambda(x)$ will exist for any arbitrary choice of the parameter $\lambda$.
Indeed, the requirement that there be an eigenfunction often restricts the
acceptable values of $\lambda$ to a discrete set. Examples of this for the Legendre,
Hermite, and Chebyshev equations appear in the exercises of Section 8.5.
Here we have one mathematical approach to the process of quantization in
quantum mechanics.

The major example of Eq. 9.6 in physics is the Schrödinger wave equation

$$H\psi(x) = E\psi(x),$$

where the differential operator $\mathcal{L}$ becomes the Hamiltonian $H$ and the
eigenvalue ($-\lambda$) becomes the total energy $E$ of the system. The eigenfunction $\psi(x)$
is usually called a wave function. A variational derivation of this Schrödinger
equation appears in Section 17.7.


EXAMPLE 9.1.1 Legendre's Equation


Legendre's equation is given by

$$(1 - x^2)y'' - 2xy' + n(n + 1)y = 0. \tag{9.7}$$

From Eqs. 9.1 and 9.6

$$\begin{aligned}
p_0(x) &= 1 - x^2 = p, & w(x) &= 1, \\
p_1(x) &= -2x = p', & \lambda &= n(n + 1), \\
p_2(x) &= 0 = q.
\end{aligned} \tag{9.8}$$

The reader will recall that our series solutions of Legendre's equation (Section
8.5)$^4$ diverged unless $n$ was restricted to one of the integers. This represents a
quantization of the eigenvalue $\lambda$.

When the equations of Chapter 8 are transformed into self-adjoint form,
we find the following values of the coefficients and parameters (Table 9.1).

TABLE 9.1

Equation                      $p(x)$                    $q(x)$             $\lambda$        $w(x)$
Legendre                      $1 - x^2$                 $0$                $l(l + 1)$       $1$
Shifted Legendre              $x(1 - x)$                $0$                $l(l + 1)$       $1$
Associated Legendre           $1 - x^2$                 $-m^2/(1 - x^2)$   $l(l + 1)$       $1$
Chebyshev I                   $(1 - x^2)^{1/2}$         $0$                $n^2$            $(1 - x^2)^{-1/2}$
Shifted Chebyshev I           $[x(1 - x)]^{1/2}$        $0$                $n^2$            $[x(1 - x)]^{-1/2}$
Chebyshev II                  $(1 - x^2)^{3/2}$         $0$                $n(n + 2)$       $(1 - x^2)^{1/2}$
Ultraspherical (Gegenbauer)   $(1 - x^2)^{\alpha+1/2}$  $0$                $n(n + 2\alpha)$ $(1 - x^2)^{\alpha-1/2}$
Bessel*                       $x$                       $-n^2/x$           $a^2$            $x$
Laguerre                      $xe^{-x}$                 $0$                $\alpha$         $e^{-x}$
Associated Laguerre           $x^{k+1}e^{-x}$           $0$                $\alpha - k$     $x^k e^{-x}$
Hermite                       $e^{-x^2}$                $0$                $2\alpha$        $e^{-x^2}$
Simple harmonic oscillator†   $1$                       $0$                $n^2$            $1$

* Orthogonality of Bessel functions is rather special. Compare Section 11.2
for details. A second type of orthogonality is developed in Section 11.7.
† This will form the basis for Chapter 14, Fourier series.

The coefficient $p(x)$ is the coefficient of the second derivative of the
eigenfunction and can usually be identified with no difficulty. The eigenvalue $\lambda$ is
the parameter, or function of the parameter, that is available [in a term of the
form $\lambda w(x)y(x)$]. Any $x$ dependence apart from the eigenfunction becomes the
weighting function $w(x)$. If there is another term containing the eigenfunction
(not the derivatives), the coefficient of the eigenfunction in this additional term
is identified as $q(x)$. If no such term is present, $q(x)$ is simply zero.

$^4$ Compare also Sections 5.2 and 12.10.

EXAMPLE 9.1.2 Deuteron

Further insight into the concepts of eigenfunction and eigenvalue may be
provided by an extremely simple model of the deuteron. The neutron-proton
nuclear interaction is represented by a square well potential: $V = V_0 < 0$ for
$0 \le r < a$, $V = 0$ for $r > a$. The Schrödinger wave equation is

$$-\frac{\hbar^2}{2M}\nabla^2\psi + V\psi = E\psi. \tag{9.9}$$

With $\psi = \psi(r)$ we may write $u(r) = r\psi(r)$, and using Exercise 2.5.18, the wave
equation becomes

$$\frac{d^2u}{dr^2} + k_1^2 u = 0, \tag{9.10}$$

with

$$k_1^2 = \frac{2M}{\hbar^2}(E - V_0) > 0 \tag{9.11}$$

for the interior range, $0 \le r < a$. Here $M$ is the reduced mass of the
neutron-proton system. For $a < r < \infty$, we have

$$\frac{d^2u}{dr^2} - k_2^2 u = 0, \tag{9.12}$$

with

$$k_2^2 = -\frac{2ME}{\hbar^2} > 0. \tag{9.13}$$

From the boundary condition that $\psi$ remain finite, $u(0) = 0$ and

$$u_1(r) = \sin k_1 r, \qquad 0 \le r < a. \tag{9.14}$$

In the range outside the potential well, we have a linear combination of the two
exponentials,

$$u_2(r) = A\exp k_2 r + B\exp(-k_2 r), \qquad a < r < \infty. \tag{9.15}$$

Continuity of particle density and current demand that $u_1(a) = u_2(a)$ and that
$u_1'(a) = u_2'(a)$. These joining conditions give

$$\begin{aligned}
\sin k_1 a &= A\exp k_2 a + B\exp(-k_2 a), \\
k_1\cos k_1 a &= k_2 A\exp k_2 a - k_2 B\exp(-k_2 a).
\end{aligned} \tag{9.16}$$

The condition that we actually have one proton-neutron combination is that
$\int \psi^*\psi\,d\tau = 1$. This constraint can be met if we impose a boundary condition
that $\psi(r)$ remain finite as $r \to \infty$. And this, in turn, means that $A = 0$. Dividing
the preceding pair of equations (to cancel $B$), we obtain

$$\tan k_1 a = -\frac{k_1}{k_2} = -\sqrt{\frac{E - V_0}{-E}}, \tag{9.17}$$

a transcendental equation for the energy $E$ with only certain discrete solutions.
If $E$ is such that Eq. 9.17 can be satisfied, our solutions $u_1(r)$ and $u_2(r)$ can
satisfy the boundary conditions. If Eq. 9.17 is not satisfied, no acceptable solution
exists. The values of $E$ for which Eq. 9.17 is satisfied are the eigenvalues; the
corresponding functions $u_1$ and $u_2$ (or $\psi$) are the eigenfunctions. For the actual
deuteron problem there is one (and only one) negative value of $E$ satisfying
Eq. 9.17, that is, the deuteron has one and only one bound state.

FIG. 9.1 A deuteron eigenfunction

Now, what happens if $E$ does not satisfy Eq. 9.17, if $E$ is not an eigenvalue?
In graphical form, imagine that $E$ and therefore $k_1$ are varied slightly.

For $E = E_1 < E_0$, $k_1$ is reduced, and $\sin k_1 a$ has not turned down as much.
The joining conditions, Eq. 9.16, require $A > 0$ and the wave function goes to
$+\infty$, exponentially. For $E = E_2 > E_0$, $k_1$ is larger, $\sin k_1 a$ peaks sooner and is
descending more rapidly at $r = a$. The joining conditions demand $A < 0$, and
the wave function goes to $-\infty$, exponentially. Only for $E = E_0$, an eigenvalue,
will the wave function have the required negative exponential asymptotic
behavior.
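
The eigenvalue condition, Eq. 9.17, is easy to explore numerically. The sketch below locates the bound-state energy by bisection on the equivalent condition $k_1\cos k_1 a + k_2\sin k_1 a = 0$ (Eq. 9.17 multiplied through by $k_2\cos k_1 a$, which avoids the poles of the tangent). The well depth $V_0 = -36$ MeV and radius $a = 2$ fm are hypothetical values chosen only for illustration, and $\hbar^2/2M = 41.47$ MeV fm$^2$ is the constant that appears in Exercise 9.1.19.

```python
import math

HBAR2_OVER_2M = 41.47  # MeV fm^2, the constant appearing in Exercise 9.1.19


def joining_mismatch(E, V0, a):
    """k1*cos(k1 a) + k2*sin(k1 a); zero exactly when tan(k1 a) = -k1/k2."""
    k1 = math.sqrt((E - V0) / HBAR2_OVER_2M)  # Eq. 9.11
    k2 = math.sqrt(-E / HBAR2_OVER_2M)        # Eq. 9.13
    return k1 * math.cos(k1 * a) + k2 * math.sin(k1 * a)


def bound_state_energy(V0, a, tol=1e-12):
    """Bisect for a root of the joining mismatch with V0 < E < 0.

    Returns None if the mismatch does not change sign on the interval.
    """
    lo, hi = V0 + 1e-9, -1e-9
    f_lo = joining_mismatch(lo, V0, a)
    if f_lo * joining_mismatch(hi, V0, a) > 0:
        return None
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = joining_mismatch(mid, V0, a)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Hypothetical square-well parameters, chosen only for this illustration
V0_ASSUMED = -36.0  # MeV
A_ASSUMED = 2.0     # fm
E0 = bound_state_energy(V0_ASSUMED, A_ASSUMED)
```

For energies slightly below or above `E0` the mismatch takes opposite signs, which corresponds to the coefficient $A$ of the growing exponential changing sign, as described in the preceding paragraph.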

Boundary Conditions 

In the foregoing definition of eigenfunction, it was noted that the
eigenfunction $u_\lambda(x)$ was required to satisfy certain imposed boundary conditions.
These boundary conditions may take three forms:

1. Cauchy boundary conditions. The value of a function
and normal derivative specified on the boundary. In
electrostatics this would mean $\varphi$, the potential, and
$E_n$, the normal component of the electric field.

2. Dirichlet boundary conditions. The value of a function
specified on the boundary.

3. Neumann boundary conditions. The normal derivative
(normal gradient) of a function specified on the
boundary. In the electrostatic case this would be $E_n$
and therefore $\sigma$, the surface charge density.

A summary of the relation of these three types of boundary condition to the 
three types of two-dimensional partial differential equation is given in Table 
9.2. For extended discussions of these partial differential equations the reader 
may consult Sommerfeld, Chapter 2, or Morse and Feshbach, Chapter 6 
(see General References). 

Parts of Table 9.2 are simply a matter of maintaining internal consistency,
of common sense. For instance, for Poisson's equation with a closed surface,
Dirichlet conditions lead to a unique, stable solution. Neumann conditions,
independent of the Dirichlet conditions, likewise lead to a unique stable solution
independent of the Dirichlet solution. Therefore Cauchy boundary conditions
(meaning Dirichlet plus Neumann) could lead to an inconsistency.

TABLE 9.2

                                   Type of partial differential equation
Boundary               Elliptic                Hyperbolic              Parabolic
conditions             Laplace, Poisson        Wave equation           Diffusion equation
                       in (x, y)               in (x, t)               in (x, t)

Cauchy
  Open surface         Unphysical results      Unique, stable          Too restrictive
                       (instability)           solution
  Closed surface       Too restrictive         Too restrictive         Too restrictive

Dirichlet
  Open surface         Insufficient            Insufficient            Unique, stable
                                                                       solution in
                                                                       one direction
  Closed surface       Unique, stable          Solution not            Too restrictive
                       solution                unique

Neumann
  Open surface         Insufficient            Insufficient            Unique, stable
                                                                       solution in
                                                                       one direction
  Closed surface       Unique, stable          Solution not            Too restrictive
                       solution                unique

The term boundary conditions includes as a special case the concept of 
initial conditions. For instance, specifying the initial position x 0 and the initial 
velocity v 0 in some dynamical problem would correspond to the Cauchy 
boundary conditions. The only difference in the present usage of boundary 
conditions in these one-dimensional problems is that we are going to apply 
the conditions on both ends of the allowed range of the variable. 

Usually the form of the differential equation or the boundary conditions
on the solutions will guarantee that at the ends of our interval (that is, at the
boundary) the following products will vanish:

$$p(x)v^*(x)\frac{du(x)}{dx}\bigg|_{x=a} = 0
\qquad\text{and}\qquad
p(x)v^*(x)\frac{du(x)}{dx}\bigg|_{x=b} = 0. \tag{9.18}$$

Here $u(x)$ and $v(x)$ are solutions of the particular differential equation (Eq. 9.6)
being considered. We can, however, work with a somewhat less restrictive set
of boundary conditions,

$$v^* p u'\big|_{x=a} = v^* p u'\big|_{x=b}, \tag{9.19}$$

in which $u(x)$ and $v(x)$ are solutions of the differential equation corresponding
to the same or to different eigenvalues. Equation 9.19 might well be satisfied
if we were dealing with a periodic physical system such as a crystal lattice.

Equations 9.18 and 9.19 are written in terms of $v^*$, the complex conjugate of $v$.
When the solutions are real, $v = v^*$ and the asterisk may be ignored. However,
in Fourier exponential expansions and in quantum mechanics the functions
will be complex and the complex conjugate will be needed.

These properties (Eq. 9.18 or 9.19) are so important for the concept of
Hermitian operator (which follows) and the consequences (Section 9.2) that
the interval $(a, b)$ will be chosen specifically to ensure that Eq. 9.18 or 9.19 is
satisfied. If our solutions are polynomials, the coefficient $p(x)$ will determine the
range of integration. Note that $p(x)$ also determines the singular points of the
differential equation, Section 8.3. For nonpolynomial solutions, for example,
$\sin nx$, $\cos nx$ ($p = 1$), the range of integration is determined by properties of
the solutions, as in Example 9.1.3.

EXAMPLE 9.1.3 Choice of Integration Interval, [a, b]

For $\mathcal{L} = d^2/dx^2$ a possible eigenvalue equation is

$$\frac{d^2}{dx^2}y(x) + n^2 y(x) = 0, \tag{9.20}$$

with eigenfunctions

$$u_n = \cos nx, \qquad v_m = \sin mx.$$

Equation 9.19 becomes

$$-n\sin mx\,\sin nx\,\Big|_a^b = 0$$

or

$$m\cos mx\,\cos nx\,\Big|_a^b = 0,$$

interchanging $u_n$ and $v_m$. Since $\sin mx$ and $\cos nx$ are periodic with period $2\pi$
(for $n$ and $m$ integral), Eq. 9.19 is clearly satisfied if $a = x_0$ and $b = x_0 + 2\pi$.

The interval is chosen so that the boundary conditions (Eq. 9.19, etc.) are
satisfied. For this case (Fourier series) the usual choices are $x_0 = 0$ leading to
$(0, 2\pi)$ and $x_0 = -\pi$ leading to $(-\pi, \pi)$. Here and throughout the following
several chapters the integration interval is chosen so that the boundary conditions
(Eq. 9.19) will be satisfied. The interval $[a, b]$ and the weighting factor $w(x)$
for the most commonly encountered second-order differential equations are
listed in Table 9.3.

TABLE 9.3

Equation                      $a$          $b$          $w(x)$
Legendre                      $-1$         $1$          $1$
Shifted Legendre              $0$          $1$          $1$
Associated Legendre           $-1$         $1$          $1$
Chebyshev I                   $-1$         $1$          $(1 - x^2)^{-1/2}$
Shifted Chebyshev I           $0$          $1$          $[x(1 - x)]^{-1/2}$
Chebyshev II                  $-1$         $1$          $(1 - x^2)^{1/2}$
Laguerre                      $0$          $\infty$     $e^{-x}$
Associated Laguerre           $0$          $\infty$     $x^k e^{-x}$
Hermite                       $-\infty$    $\infty$     $e^{-x^2}$
Simple harmonic oscillator    $0$          $2\pi$       $1$
                              $-\pi$       $\pi$        $1$

Note. 1. The orthogonality interval $[a, b]$ is determined by the boundary
conditions of Section 9.1.

2. The weighting function is established by putting the differential
equation in self-adjoint form.

Hermitian Operators

We now prove an important property of the combination self-adjoint,
second-order differential operator (Eq. 9.6), plus solutions $u(x)$ and $v(x)$ that
satisfy boundary conditions given by Eq. 9.19.


By integrating $v^*$ (complex conjugate) times the second-order self-adjoint
differential operator $\mathcal{L}$ (operating on $u$) over the range $a \le x \le b$, we obtain

$$\int_a^b v^*\mathcal{L}u\,dx = \int_a^b v^*(pu')'\,dx + \int_a^b v^* q u\,dx \tag{9.21}$$

using Eq. 9.4. Integrating by parts, we have

$$\int_a^b v^*(pu')'\,dx = v^* p u'\,\Big|_a^b - \int_a^b v^{*\prime} p u'\,dx. \tag{9.22}$$

The integrated part vanishes on application of the boundary conditions (Eq.
9.19). Integrating the remaining integral by parts a second time, we have

$$-\int_a^b v^{*\prime} p u'\,dx = -v^{*\prime} p u\,\Big|_a^b + \int_a^b u(pv^{*\prime})'\,dx. \tag{9.23}$$

Again, the integrated part vanishes in an application of Eq. 9.19. A combination
of Eqs. 9.21 to 9.23 gives us

$$\int_a^b v^*\mathcal{L}u\,dx = \int_a^b u\mathcal{L}v^*\,dx. \tag{9.24}$$

This property, given by Eq. 9.24, is expressed by saying that the operator $\mathcal{L}$ is
Hermitian with respect to the functions $u(x)$ and $v(x)$ which satisfy the boundary
conditions specified by Eq. 9.19. Note carefully that this Hermitian property
follows from self-adjointness plus boundary conditions.

Hermitian Operators in Quantum Mechanics 

The preceding development in this section has focused on the classical
second-order differential operators of mathematical physics. Generalizing our
Hermitian operator theory as required in quantum mechanics, we have an
extension: The operators need be neither second-order differential operators
nor real. $p_x = -i\hbar(\partial/\partial x)$ will be an Hermitian operator. We simply assume
(as is customary in quantum mechanics) that the wave functions satisfy
appropriate boundary conditions: vanishing sufficiently strongly at infinity or having
periodic behavior (as in a crystal lattice, or unit intensity for waves). The
operator $\mathcal{L}$ is called Hermitian if

$$\int \psi_1^*\mathcal{L}\psi_2\,d\tau = \int (\mathcal{L}\psi_1)^*\psi_2\,d\tau. \tag{9.25}$$

Apart from the simple extension to complex quantities, this definition is identical
with Eq. 9.24.

The adjoint $A^\dagger$ of an operator $A$ is defined by

$$\int \psi_1^* A^\dagger\psi_2\,d\tau = \int (A\psi_1)^*\psi_2\,d\tau. \tag{9.26}$$

This is quite different from our classical, second derivative operator-oriented
definition, Eq. 9.2. Here the adjoint is defined in terms of the resultant integral,
with the $A^\dagger$ as part of the integrand. Clearly, if $A = A^\dagger$ (self-adjoint), then $A$ is
Hermitian. The converse is not so simple (and not always true), but in quantum
mechanics the two terms self-adjoint and Hermitian are usually taken to be
synonymous. (This is also done in matrix analysis, Section 4.5.)

The expectation value of an operator $\mathcal{L}$ is defined as

$$\langle\mathcal{L}\rangle = \int \psi^*\mathcal{L}\psi\,d\tau. \tag{9.27a}$$

In the framework of quantum mechanics $\langle\mathcal{L}\rangle$ corresponds to the result of a
measurement of the physical quantity represented by $\mathcal{L}$ when the physical
system is in a state described by the wave function $\psi$. If we require $\mathcal{L}$ to be
Hermitian, it is easy to show that $\langle\mathcal{L}\rangle$ is real (as would be expected from a
measurement in a physical theory). Taking the complex conjugate of Eq. 9.27a,
we obtain

$$\langle\mathcal{L}\rangle^* = \left[\int \psi^*\mathcal{L}\psi\,d\tau\right]^* = \int \psi\mathcal{L}^*\psi^*\,d\tau.$$

Rearranging the factors in the integrand, we have

$$\langle\mathcal{L}\rangle^* = \int (\mathcal{L}\psi)^*\psi\,d\tau.$$

Then, applying our definition of Hermitian operator, Eq. 9.25, we get

$$\langle\mathcal{L}\rangle^* = \int \psi^*\mathcal{L}\psi\,d\tau = \langle\mathcal{L}\rangle, \tag{9.27b}$$

or $\langle\mathcal{L}\rangle$ is real. It is worth noting that $\psi$ is not necessarily an eigenfunction
of $\mathcal{L}$.

EXERCISES


9.1.1 Show that Laguerre's equation may be put into self-adjoint form by multiplying
by $e^{-x}$ and that $w(x) = e^{-x}$ is the weighting function.

9.1.2 Show that the Hermite equation may be put into self-adjoint form by multiplying
by $e^{-x^2}$ and that this gives $w(x) = e^{-x^2}$ as the appropriate density function.

9.1.3 Show that the Chebyshev equation (type I) may be put into self-adjoint form by
multiplying by $(1 - x^2)^{-1/2}$ and that this gives $w(x) = (1 - x^2)^{-1/2}$ as the
appropriate density function.


9.1.4 Show the following when the linear second-order differential equation is expressed
in self-adjoint form:

(a) The Wronskian is equal to a constant divided by the initial coefficient $p$:

$$W[y_1, y_2] = \frac{C}{p(x)}.$$

(b) A second solution is given by

$$y_2(x) = Cy_1(x)\int^x \frac{dt}{p[y_1(t)]^2}.$$

9.1.5 $U_n(x)$, the Chebyshev polynomial (type II), satisfies the differential equation

$$(1 - x^2)U_n''(x) - 3xU_n'(x) + n(n + 2)U_n(x) = 0.$$

(a) Locate the singular points that appear in the finite plane and show whether
they are regular or irregular.

(b) Put this equation in self-adjoint form.

(c) Identify the complete eigenvalue.

(d) Identify the weighting function.

9.1.6 For the very special case $\lambda = 0$ and $q(x) = 0$ the self-adjoint eigenvalue equation
becomes

$$\frac{d}{dx}\left[p(x)\frac{du(x)}{dx}\right] = 0,$$

satisfied by

$$\frac{du}{dx} = \frac{1}{p(x)}.$$

Use this to obtain a "second" solution of the following:

(a) Legendre's equation,

(b) Laguerre's equation,

(c) Hermite's equation.

ANS. (a) $u_2(x) = \frac{1}{2}\ln\frac{1 + x}{1 - x}$,
(b) $u_2(x) - u_2(x_0) = \int_{x_0}^x e^t\,\frac{dt}{t}$,
(c) $u_2(x) = \int_0^x e^{t^2}\,dt$.

These second solutions illustrate the divergent behavior usually found in a
second solution.

Note. In all three cases $u_1(x) = 1$.

9.1.7 Given that $\mathcal{L}u = 0$ and $g\mathcal{L}u$ is self-adjoint, show that for the adjoint operator
$\bar{\mathcal{L}}$, $\bar{\mathcal{L}}(gu) = 0$.

9.1.8 For a second-order differential operator $\mathcal{L}$ that is self-adjoint show that

$$\int_a^b [y_2\mathcal{L}y_1 - y_1\mathcal{L}y_2]\,dx = p(y_1'y_2 - y_1y_2')\,\Big|_a^b.$$

9.1.9 Show that if a function $\psi$ is required to satisfy Laplace's equation in a finite region
of space and to satisfy Dirichlet boundary conditions over the entire closed
bounding surface, then $\psi$ is unique.

Hint. One of the forms of Green's theorem, Section 1.11, will be helpful.

9.1.10 Consider the solutions of the Legendre, Chebyshev, Hermite, and Laguerre
equations to be polynomials. Show that the ranges of integration that guarantee
that the Hermitian operator boundary conditions will be satisfied are

(a) Legendre $[-1, 1]$,

(b) Chebyshev $[-1, 1]$,

(c) Hermite $(-\infty, \infty)$,

(d) Laguerre $[0, \infty)$.

9.1.11 Within the framework of quantum mechanics (Eqs. 9.25 and following), show
that the following are Hermitian operators:

(a) momentum $\mathbf{p} = -i\hbar\nabla = -i\dfrac{h}{2\pi}\nabla$,

(b) angular momentum $\mathbf{L} = -i\hbar\,\mathbf{r}\times\nabla = -i\dfrac{h}{2\pi}\,\mathbf{r}\times\nabla$.

Hint. In cartesian form $\mathbf{L}$ is a linear combination of noncommuting Hermitian
operators.

9.1.12 (a) $A$ is a non-Hermitian operator. In the sense of Eqs. 9.25 and 9.26, show that

$$A + A^\dagger \qquad\text{and}\qquad i(A - A^\dagger)$$

are Hermitian operators.

(b) Using the preceding result, show that every non-Hermitian operator may
be written as a linear combination of two Hermitian operators.

9.1.13 $U$ and $V$ are two arbitrary operators, not necessarily Hermitian. In the sense of
Eq. 9.26, show that

$$(UV)^\dagger = V^\dagger U^\dagger.$$

Note the resemblance to Eq. 4.124 for adjoint matrices.

Hint. Apply the definition of adjoint operator, Eq. 9.26.

9.1.14 Prove that the product of two Hermitian operators is Hermitian (Eq. 9.25) if
and only if the two operators commute.

9.1.15 $A$ and $B$ are noncommuting quantum mechanical operators:

$$AB - BA = iC.$$

Show that $C$ is Hermitian. Assume that appropriate boundary conditions are
satisfied.

9.1.16 The operator $\mathcal{L}$ is Hermitian. Show that $\langle\mathcal{L}^2\rangle \ge 0$.

9.1.17 A quantum mechanical expectation value is defined by

$$\langle A\rangle = \int \psi^*(x)A\psi(x)\,dx,$$

where $A$ is a linear operator. Show that demanding that $\langle A\rangle$ be real means that
$A$ must be Hermitian with respect to $\psi(x)$.

9.1.18 From the definition of adjoint, Eq. 9.26, show that $A^{\dagger\dagger} = A$ in the sense that
$\int \psi_1^* A^{\dagger\dagger}\psi_2\,d\tau = \int \psi_1^* A\psi_2\,d\tau$. The adjoint of the adjoint is the original operator.
Hint. The functions $\psi_1$ and $\psi_2$ of Eq. 9.26 represent a class of functions. The
subscripts 1 and 2 may be interchanged or replaced by other subscripts.

9.1.19 The Schrödinger wave equation for the deuteron (with a Woods-Saxon potential)
is

$$-\frac{\hbar^2}{2M}\nabla^2\psi + \frac{V_0}{1 + \exp[(r - r_0)/a]}\,\psi = E\psi.$$

Here $E = -2.224$ MeV. $a$ is a "thickness parameter," $0.4 \times 10^{-13}$ centimeters.
Expressing lengths in fermis ($10^{-13}$ centimeters) and energies in million electron
volts (MeV), we may rewrite the wave equation as

$$\frac{d^2}{dr^2}(r\psi) + \frac{1}{41.47}\left[E - \frac{V_0}{1 + \exp[(r - r_0)/a]}\right](r\psi) = 0.$$

$E$ is assumed known from experiment. The game is to find $V_0$ for a specified value
of $r_0$ (say, $r_0 = 2.1$). If we let $y(r) = r\psi(r)$, then $y(0) = 0$ and we take $y'(0) = 1$.
Find $V_0$ such that $y(20.0) = 0$. (This should be $y(\infty)$, but $r = 20$ is far enough
beyond the range of nuclear forces to approximate infinity.)

ANS. For $a = 0.4$ and $r_0 = 2.1$ fm, $V_0 = -34.159$ MeV.

9.1.20 Determine the nuclear potential well parameter $V_0$ of Exercise 9.1.19 as a function
of $r_0$ for $r_0 = 2.00(0.05)2.25$ fermis.

Express your results as a power law

$$|V_0|\,r_0^{\nu} = k.$$

Determine the exponent $\nu$ and the constant $k$. This power law formulation is
useful for accurate interpolation.

9.1.21 In Exercise 9.1.19 it was assumed that 20 fermis was a good approximation to
infinity. Check on this by calculating $V_0$ for $r\psi(r) = 0$ at (a) $r = 15$, (b) $r = 20$,
(c) $r = 25$, and (d) $r = 30$. Sketch your results. Take $r_0 = 2.10$ and $a = 0.4$ (fermis).

9.1.22 For a quantum particle moving in a potential well, $V(x) = \frac{1}{2}m\omega^2x^2$, the
Schrödinger wave equation is

$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + \frac{1}{2}m\omega^2x^2\psi(x) = E\psi(x)$$

or

$$\frac{d^2\psi(z)}{dz^2} - z^2\psi(z) = -\frac{2E}{\hbar\omega}\psi(z),$$

where $z = (m\omega/\hbar)^{1/2}x$. Since this operator is even, we expect solutions of definite
parity. For the initial conditions that follow, integrate out from the origin and
determine the minimum constant $2E/\hbar\omega$ that will lead to $\psi(\infty) = 0$ in each case.
(You may take $z = 6$ as an approximation of infinity.)

(a) For an even eigenfunction,

$$\psi(0) = 1, \qquad \psi'(0) = 0.$$

(b) For an odd eigenfunction,

$$\psi(0) = 0, \qquad \psi'(0) = 1.$$

Note. Analytical solutions appear in Section 13.1.

9.2 HERMITIAN (SELF-ADJOINT) OPERATORS


Hermitian or self-adjoint operators have three properties that are of extreme 
importance in physics, both classical and quantum. 

1. The eigenvalues of an Hermitian operator are real. 

2. The eigenfunctions of an Hermitian operator are 
orthogonal. 

3. The eigenfunctions of an Hermitian operator form 
a complete set. 1 

Real Eigenvalues 

We proceed to prove the first two of these three properties. Let

$$\mathcal{L}u_i + \lambda_i w u_i = 0. \tag{9.28}$$

Assuming the existence of a second eigenvalue and eigenfunction

$$\mathcal{L}u_j + \lambda_j w u_j = 0. \tag{9.29}$$

Then, taking the complex conjugate, we obtain

$$\mathcal{L}u_j^* + \lambda_j^* w u_j^* = 0. \tag{9.30}$$

Here $\mathcal{L}$ is a real operator ($p$ and $q$ are real functions of $x$) and $w(x)$ is a real
function. But we permit $\lambda_k$, the eigenvalues, and $u_k$, the eigenfunctions, to be
complex. Multiplying Eq. 9.28 by $u_j^*$ and Eq. 9.30 by $u_i$ and then subtracting,
we have

$$u_j^*\mathcal{L}u_i - u_i\mathcal{L}u_j^* = (\lambda_j^* - \lambda_i)w u_i u_j^*. \tag{9.31}$$

We integrate over the range $a \le x \le b$,

$$\int_a^b u_j^*\mathcal{L}u_i\,dx - \int_a^b u_i\mathcal{L}u_j^*\,dx = (\lambda_j^* - \lambda_i)\int_a^b u_i u_j^* w\,dx. \tag{9.32}$$

Since $\mathcal{L}$ is Hermitian, the left-hand side vanishes by Eq. 9.24 and

$$(\lambda_j^* - \lambda_i)\int_a^b u_i u_j^* w\,dx = 0. \tag{9.33}$$

If $i = j$, the integral cannot vanish [$w(x) > 0$, apart from isolated points], except
in the trivial case $u_i = 0$. Hence the coefficient $(\lambda_i^* - \lambda_i)$ must be zero,

$$\lambda_i^* = \lambda_i, \tag{9.34}$$

which is a mathematical statement that the eigenvalue is real. Since $\lambda_i$ can
represent any one of the eigenvalues, this proves the first property. This is an
exact analog of the nature of the eigenvalues of real symmetric (and of Hermitian)
matrices (compare Section 4.6).

$^1$ This third property is not universal. It does hold for our linear, second-order
differential operators in Sturm-Liouville (self-adjoint) form. Completeness
is defined and discussed in Section 9.4. A proof that the eigenfunctions of our
linear, second-order, self-adjoint, differential equations form a complete set
may be developed from the calculus of variations of Section 17.8.

This reality of the eigenvalues of Hermitian operators has a fundamental
significance in quantum mechanics. In quantum mechanics the eigenvalues
correspond to precisely measurable quantities, such as energy and angular
momentum. With the theory formulated in terms of Hermitian operators, this
proof of the reality of the eigenvalues guarantees that the theory will predict
real numbers for these measurable physical quantities. In Section 17.8 it will
be seen that the set of real eigenvalues has a lower bound.

Orthogonal Eigenfunctions 

If we now take i j and if =f= the integral of the product of the two 
different eigenfunctions must vanish. 

n b 

UiUfwdx = 0. (9.35) 

Ja 

This condition, called orthogonality, is the continuum analog of the vanishing 
of a scalar product of two vectors. 2 We say that the eigenfunctions n,(x) and 
Uj(x) are orthogonal with respect to the weighting function w(x) over the interval 
[< 2 ,fc]. Equation 9.35 constitutes a partial proof of the second property of our 
Hermitian operators. Again, the precise analogy with matrix analysis should 
be noted. Indeed, we can establish a one-to-one correspondence between this 
Sturm-Liouville theory of differential equations and the treatment of Hermitian 
matrices. Historically, this correspondence has been significant in establishing 
the mathematical equivalence of matrix mechanics developed by Heisenberg 
and wave mechanics developed by Schrodinger. Today, the two diverse ap- 
proaches are merged into the theory of quantum mechanics and the mathe- 
matical formulation that is more convenient for a particular problem is used 
for that problem. Actually the mathematical alternatives do not end here. 
Integral equations, Chapter 16, form a third equivalent and sometimes more 
convenient or more powerful approach. 

This proof of orthogonality is not quite complete. There is a loophole, 
because we may have $i \neq j$ but still have $\lambda_i = \lambda_j$. Such a case is labeled degenerate.
Illustrations of degeneracy are given at the end of this section. If $\lambda_i = \lambda_j$, the
integral in Eq. 9.33 need not vanish. This means that linearly independent
eigenfunctions corresponding to the same eigenvalue are not automatically 
orthogonal and that some other method must be sought to obtain an orthogonal 
set. Although the eigenfunctions in this degenerate case may not be orthogonal, 
they can always be made orthogonal. One method is developed in the next 
section. 


$^2$ From the definition of the Riemann integral,

$$\int_a^b f(x)g(x)\,dx = \lim_{N\to\infty}\left(\sum_{i=1}^{N} f(x_i)g(x_i)\,\Delta x\right),$$

where $x_0 = a$, $x_N = b$, and $x_i - x_{i-1} = \Delta x$. If we interpret $f(x_i)$ and $g(x_i)$ as
the $i$th components of an $N$-component vector, then this sum (and therefore
this integral) corresponds directly to a scalar product of vectors, Eq. 1.22.
The vanishing of the scalar product is the condition for orthogonality of the
vectors, or functions.

We shall see in succeeding chapters that it is just as desirable to have a given
set of functions orthogonal as it is to have an orthogonal coordinate system.
We can work with nonorthogonal functions, but they are likely to prove as
messy as an oblique coordinate system.

EXAMPLE 9.2.1 Fourier Series: Orthogonality 


Continuing Example 9.1.3, the eigenvalue equation, Eq. 9.20,

$$\frac{d^2}{dx^2}y(x) + n^2 y(x) = 0,$$

perhaps describes a quantum mechanical particle in a box, perhaps a vibrating
violin string, with (degenerate) eigenfunctions $\cos nx$, $\sin nx$.

With $n$ real (here taken to be integral), the orthogonality integrals become

(a) $\displaystyle\int_{x_0}^{x_0+2\pi} \sin mx \,\sin nx\,dx = C_n\delta_{nm}$,

(b) $\displaystyle\int_{x_0}^{x_0+2\pi} \cos mx \,\cos nx\,dx = D_n\delta_{nm}$,

(c) $\displaystyle\int_{x_0}^{x_0+2\pi} \sin mx \,\cos nx\,dx = 0$.

For an interval of $2\pi$ the preceding analysis guarantees the Kronecker delta in
(a) and (b) but not the zero in (c) because (c) involves degenerate eigenfunctions.
However, inspection shows that (c) always vanishes for all integral $m$ and $n$.

Our Sturm-Liouville theory says nothing about the values of $C_n$ and $D_n$.
Actual calculation yields

$$C_n = \begin{cases}\pi, & n \neq 0,\\ 0, & n = 0,\end{cases}
\qquad
D_n = \begin{cases}\pi, & n \neq 0,\\ 2\pi, & n = 0.\end{cases}$$


These orthogonality integrals form the basis of the Fourier series developed 
in Chapter 14. 
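
The three orthogonality integrals of this example are easy to check by machine. The following is a minimal sketch, not part of the original development: the function names `integrate` and `overlaps`, the starting point `x0 = 0.3`, and the choice of a midpoint rule are all illustrative assumptions.

```python
import math

def integrate(func, a, b, steps=2000):
    """Composite midpoint rule on [a, b] (an assumed, simple quadrature)."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

def overlaps(m, n, x0=0.3):
    """Return integrals (a), (b), (c) of Example 9.2.1 over [x0, x0 + 2*pi]."""
    a, b = x0, x0 + 2.0 * math.pi
    ss = integrate(lambda x: math.sin(m * x) * math.sin(n * x), a, b)
    cc = integrate(lambda x: math.cos(m * x) * math.cos(n * x), a, b)
    sc = integrate(lambda x: math.sin(m * x) * math.cos(n * x), a, b)
    return ss, cc, sc

# Print (m, n) followed by the sin-sin, cos-cos, and sin-cos integrals.
for m, n in [(0, 0), (1, 1), (2, 3), (3, 3)]:
    print(m, n, [round(v, 6) for v in overlaps(m, n)])
```

For $m = n \neq 0$ the first two integrals should come out close to $\pi$, for $m = n = 0$ they should be close to 0 and $2\pi$, and the mixed integral (c) should vanish in every case, whatever value is taken for `x0`.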


EXAMPLE 9.2.2 Expansion in Orthogonal Eigenfunctions: Square Wave 


The property of completeness means that certain classes of function (i.e., 
sectionally or piecewise continuous) may be represented by a series of orthogonal 
eigenfunctions to any desired degree of accuracy. Consider the square wave 


$$f(x) = \begin{cases} \dfrac{h}{2}, & 0 < x < \pi,\\[1ex] -\dfrac{h}{2}, & -\pi < x < 0. \end{cases} \tag{9.36}$$

This function may be expanded in any of a variety of eigenfunctions: Legendre,
Hermite, Chebyshev, and so on. The choice of eigenfunction is made on the 
basis of convenience. To illustrate the expansion technique, let us choose the 
eigenfunctions of Example 9.2.1, cos nx and sin nx . 

The eigenfunction series is conveniently (and conventionally) written as

$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty}(a_n\cos nx + b_n\sin nx).$$

From the orthogonality integrals of Example 9.2.1 the coefficients are given by

$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos nt\,dt, \qquad
b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin nt\,dt, \qquad n = 0, 1, 2, \ldots.$$

Direct substitution of $\pm h/2$ for $f(t)$ yields

$$a_n = 0,$$

which is expected here because of the antisymmetry, and

$$b_n = \frac{h}{n\pi}(1 - \cos n\pi) = \begin{cases} 0, & n \text{ even},\\[0.5ex] \dfrac{2h}{n\pi}, & n \text{ odd}. \end{cases}$$

Hence the eigenfunction (Fourier) expansion of the square wave is

$$f(x) = \frac{2h}{\pi}\sum_{n=0}^{\infty}\frac{\sin(2n+1)x}{2n+1}. \tag{9.37}$$

Additional examples, using other eigenfunctions, appear in Chapters 11 and 12. 
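
To see how the series of Eq. 9.37 approaches the square wave, the partial sums can be evaluated directly. This is a small sketch under stated assumptions: the function name `square_wave_partial_sum` and the default $h = 1$ are illustrative choices, not part of the text.

```python
import math

def square_wave_partial_sum(x, terms, h=1.0):
    """First `terms` terms of Eq. 9.37: (2h/pi) * sum_n sin((2n+1)x)/(2n+1)."""
    total = sum(math.sin((2 * n + 1) * x) / (2 * n + 1) for n in range(terms))
    return 2.0 * h / math.pi * total

# At x = pi/2 the square wave equals h/2; watch the partial sums approach it.
for terms in (1, 10, 100, 1000):
    print(terms, square_wave_partial_sum(math.pi / 2, terms))
```

At $x = \pi/2$ the sum reduces to the alternating series $1 - \frac{1}{3} + \frac{1}{5} - \cdots$, so the convergence toward $h/2$ is slow; many terms are needed for even three-figure accuracy.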


Degeneracy 

The concept of degeneracy was introduced earlier. If N linearly independent 
eigenfunctions correspond to the same eigenvalue, the eigenvalue is said to be 
$N$-fold degenerate. A particularly simple illustration is provided by the eigenvalues
and eigenfunctions of the linear oscillator equation, Example 9.2.1. For
each value of the eigenvalue n, there are two possible solutions: sin nx and cos nx 
(and any linear combination). We may say the eigenfunctions are degenerate or 
the eigenvalue is degenerate. 

A more involved example is furnished by the physical system of an electron in
an atom (nonrelativistic treatment, spin neglected). From the Schrödinger
equation, Eq. 13.53 for hydrogen, the total energy of the electron is our eigenvalue.
We may label it $E_{nLM}$ by using the quantum numbers $n$, $L$, and $M$ as
subscripts. For each distinct set of quantum numbers $(n, L, M)$ there is a distinct,
linearly independent eigenfunction $\psi_{nLM}(r, \theta, \varphi)$. For hydrogen, the energy $E_{nLM}$
is independent of $L$ and $M$. With $0 \le L \le n - 1$ and $-L \le M \le L$, the eigenvalue
is $n^2$-fold degenerate (including the electron spin would raise this to $2n^2$).
In atoms with more than one electron the electrostatic potential is no longer a
simple $r^{-1}$ potential. The energy depends on $L$ as well as on $n$, although not on
$M$. $E_{nLM}$ is still $(2L + 1)$-fold degenerate. This degeneracy may be removed by
applying an external magnetic field, giving rise to the Zeeman effect. 
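
The counting behind the $n^2$-fold degeneracy can be verified by simply listing the allowed quantum numbers. The helper name `hydrogen_states` below is a hypothetical one introduced only for this illustration.

```python
def hydrogen_states(n):
    """List the (L, M) pairs allowed for principal quantum number n (spin neglected)."""
    return [(L, M) for L in range(n) for M in range(-L, L + 1)]

# Each value of L contributes 2L + 1 values of M; the total over L is n**2.
for n in range(1, 5):
    print(n, len(hydrogen_states(n)))
```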


EXERCISES 


9.2.1 The functions $u_1(x)$ and $u_2(x)$ are eigenfunctions of the same Hermitian operator
but for distinct eigenvalues $\lambda_1$ and $\lambda_2$. Prove that $u_1(x)$ and $u_2(x)$ are linearly
independent.


9.2.2 (a) The vectors $\mathbf{e}_n$ are orthogonal to each other: $\mathbf{e}_n\cdot\mathbf{e}_m = 0$ for $n \neq m$. Show
that they are linearly independent.

(b) The functions $\psi_n(x)$ are orthogonal to each other over the interval $[a, b]$ and
with respect to the weighting function $w(x)$. Show that the $\psi_n(x)$ are linearly
independent.


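9.2.3

$$P_1(x) = x \quad\text{and}\quad Q_0(x) = \frac{1}{2}\ln\left(\frac{1+x}{1-x}\right)$$

are solutions of Legendre's differential equation corresponding to different
eigenvalues.

(a) Evaluate their orthogonality integral

$$\int_{-1}^{1}\frac{x}{2}\ln\left(\frac{1+x}{1-x}\right)dx.$$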


(b) Explain why these two functions are not orthogonal, why the proof of 
orthogonality does not apply. 


9.2.4 $T_0(x) = 1$ and $V_1(x) = (1 - x^2)^{1/2}$ are solutions of the Chebyshev differential
equation corresponding to different eigenvalues. Explain, in terms of the boundary
conditions, why these two functions are not orthogonal.


9.2.5 (a) Show that the first derivatives of the Legendre polynomials satisfy a self-adjoint
differential equation with eigenvalue $\lambda = n(n + 1) - 2$.

(b) Show that these Legendre polynomial derivatives satisfy an orthogonality
relation

$$\int_{-1}^{1}P_m'(x)P_n'(x)(1 - x^2)\,dx = 0, \qquad m \neq n.$$

Note. In Section 12.5 $(1 - x^2)^{1/2}P_n'(x)$ will be labeled an associated Legendre
polynomial, $P_n^1(x)$.


9.2.6 A set of functions $u_n(x)$ satisfy the Sturm-Liouville equation

$$\frac{d}{dx}\left[p(x)\frac{d}{dx}u_n(x)\right] + \lambda_n w(x)u_n(x) = 0.$$

The functions $u_m(x)$ and $u_n(x)$ satisfy boundary conditions that lead to orthogonality.
The corresponding eigenvalues $\lambda_m$ and $\lambda_n$ are distinct. Prove that for appropriate
boundary conditions $u_m'(x)$ and $u_n'(x)$ are orthogonal with $p(x)$ as a weighting
function.


9.2.7 A linear operator $A$ has $n$ distinct eigenvalues and $n$ corresponding eigenfunctions:
$A\psi_i = \lambda_i\psi_i$. Show that the $n$ eigenfunctions are linearly independent. $A$ is
not necessarily Hermitian.

Hint. Assume linear dependence, that $\psi_n = \sum_{i=1}^{n-1} a_i\psi_i$. Use this relation and the
operator-eigenfunction equation first in one order, then in the reverse order.
Show that a contradiction results.


9.2.8 A set of functions are mutually orthogonal. Show that they are automatically
linearly independent, that orthogonality implies linear independence.

9.2.9 The ultraspherical polynomials $C_n^{(\alpha)}(x)$ are solutions of the differential equation

$$\left\{(1 - x^2)\frac{d^2}{dx^2} - (2\alpha + 1)x\frac{d}{dx} + n(n + 2\alpha)\right\}C_n^{(\alpha)}(x) = 0.$$

(a) Transform this differential equation into self-adjoint form.

(b) Show that the $C_n^{(\alpha)}(x)$ are orthogonal for different $n$. Specify the interval of
integration and the weighting factor.

Note. Assume that your solutions are polynomials. 


9.2.10 With $\mathcal{L}$ not self-adjoint,

$$\mathcal{L}u_i + \lambda_i w u_i = 0$$

and

$$\bar{\mathcal{L}}v_j + \lambda_j w v_j = 0.$$

(a) Show that

$$\int_a^b v_j\mathcal{L}u_i\,dx = \int_a^b u_i\bar{\mathcal{L}}v_j\,dx,$$

provided

$$u_i p_0 v_j'\Big|_a^b = v_j p_0 u_i'\Big|_a^b$$

and

$$u_i(p_1 - p_0')v_j\Big|_a^b = 0.$$

(b) Show that the orthogonality integral for the eigenfunctions $u_i$ and $v_j$ becomes

$$\int_a^b u_i v_j w\,dx = 0 \qquad (\lambda_i \neq \lambda_j).$$

9.2.11 In Exercise 8.5.8 the series solution of the Chebyshev equation is found to be
convergent for all $n$. Therefore $n$ is not quantized by the argument used for
Legendre (Exercise 8.5.4). Calculate the sum of the $k = 0$ Chebyshev series for
$n = v = 0.8$, 0.9, and 1.0 and for $x = 0.0(0.1)0.9$.

Note. The Chebyshev series recurrence relation is given in Exercise 5.2.16.


9.2.12 (a) Evaluate the $n = v = 0.9$, $k = 0$ Chebyshev series for $x = 0.98$, 0.99, and
1.00. The series converges very slowly at $x = 1.00$. You may wish to use
double precision. Upper bounds to the error in your calculation can be set
by comparison with the $v = 1.0$ case, which corresponds to $(1 - x^2)^{1/2}$.

(b) These series solutions for $v = 0.9$ and for $v = 1.0$ are obviously not orthogonal
despite the fact that they satisfy a self-adjoint eigenvalue equation
with different eigenvalues. From the behavior of the solutions in the vicinity
of $x = 1.00$ try to formulate a hypothesis as to why the proof of orthogonality
does not apply.

9.2.13 The Fourier expansion of the (asymmetric) square wave is given by Eq. 9.37. With
$h = 2$, evaluate this series for $x = 0(\pi/18)\pi/2$, using the first (a) 10 terms, (b) 100
terms of the series.

Note. For 10 terms and $x = \pi/18$ or 10° your Fourier representation has a sharp
hump. This is the Gibbs phenomenon of Section 14.5. For 100 terms this hump
has been shifted over to about 1°.


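9.2.14 The symmetric square wave

$$f(x) = \begin{cases} 1, & |x| < \dfrac{\pi}{2},\\[1ex] -1, & \dfrac{\pi}{2} < |x| < \pi \end{cases}$$

has a Fourier expansion

$$f(x) = \frac{4}{\pi}\sum_{n=0}^{\infty}(-1)^n\frac{\cos(2n+1)x}{2n+1}.$$

Evaluate this series for $x = 0(\pi/18)\pi/2$ using the first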

(a) 10 terms, 

(b) 100 terms of the series. 

Note. As in Exercise 9.2.13, the Gibbs phenomenon appears at the discontinuity. 
This means that a Fourier series is not suitable for precise numerical work in the 
vicinity of a discontinuity. 


9.3 GRAM-SCHMIDT ORTHOGONALIZATION 

The Gram-Schmidt orthogonalization is a method that takes a nonorthogonal
set of linearly independent functions$^1$ and literally constructs an
orthogonal set over an arbitrary interval and with respect to an arbitrary weight
or density factor. In the language of linear algebra the process is equivalent to a 
matrix transformation relating an orthogonal set of basis vectors (functions) to 
a nonorthogonal set. A specific example of this matrix transformation appears 
in Exercise 12.2.1. The functions involved may be real or complex. Here for con- 
venience they are assumed to be real. The generalization to the complex case 
should offer little difficulty. 

$^1$ Such a set of functions might well arise from the solutions of a (partial)
differential equation in which the eigenvalue was independent of one or more
of the constants of separation. As an example, we have the hydrogen atom
problem (Sections 9.2 and 13.2). The eigenvalue (energy) is independent of
both the electron orbital angular momentum and its projection on the $z$-axis,
$m$. The student should note, however, that the origin of the set of functions
is irrelevant to the Gram-Schmidt orthogonalization procedure.

Before taking up orthogonalization, we should consider normalization of
functions. So far no normalization has been specified. This means that

$$\int_a^b \varphi_i^2\,w\,dx = N_i^2,$$

but no attention has been paid to the value of $N_i$. Since our basic equation (Eq.
9.6) is linear and homogeneous, we may multiply our solution by any constant
and it will still be a solution. We now demand that each solution $\varphi_i(x)$ be multiplied
by $N_i^{-1}$ so that the new (normalized) $\varphi_i$ will satisfy

$$\int_a^b \varphi_i^2(x)w(x)\,dx = 1 \tag{9.38}$$

and

$$\int_a^b \varphi_i(x)\varphi_j(x)w(x)\,dx = \delta_{ij}. \tag{9.39}$$

Equation 9.38 says that we have normalized to unity. Including the property of 
orthogonality, we have Eq. 9.39. Functions satisfying this equation are said to be 
orthonormal (orthogonal plus unit normalization). It should be emphasized that 
other normalizations are possible, and indeed, by historical convention, each of 
the special functions of mathematical physics treated in Chapters 12 and 13 will 
be normalized differently ! 

We consider three sets of functions: an original, given set $u_n(x)$, $n = 0, 1, 2, \ldots$;
an orthogonalized set $\psi_n(x)$ to be constructed; and a final set of functions $\varphi_n(x)$,
which are the normalized $\psi_n$'s. The original $u_n$'s may be degenerate eigenfunctions,
but this is not necessary. We shall have

    $u_n(x)$                 $\psi_n(x)$              $\varphi_n(x)$
    linearly independent     linearly independent     linearly independent
    nonorthogonal            orthogonal               orthogonal
    unnormalized             unnormalized             normalized
                                                      (orthonormal)

The Gram-Schmidt procedure is to take the $n$th $\psi$ function ($\psi_n$) to be $u_n(x)$
plus an unknown linear combination of the previous $\varphi$'s. The presence of the
new $u_n(x)$ will guarantee linear independence. The requirement that $\psi_n(x)$ be
orthogonal to each of the previous $\varphi$'s yields just enough constraints to determine
each of the unknown coefficients. Then the fully determined $\psi_n$ will be
normalized to unity, yielding $\varphi_n(x)$. Then the sequence of steps is repeated for
$\psi_{n+1}(x)$.

Starting with $n = 0$, let

$$\psi_0(x) = u_0(x) \tag{9.40}$$

with no "previous" $\varphi$'s to worry about. Normalizing,

$$\varphi_0(x) = \frac{\psi_0(x)}{\left[\int\psi_0^2\,w\,dx\right]^{1/2}}. \tag{9.41}$$

For $n = 1$, let

$$\psi_1(x) = u_1(x) + a_{10}\varphi_0(x). \tag{9.42}$$

We demand that $\psi_1(x)$ be orthogonal to $\varphi_0(x)$. (At this stage the normalization
of $\psi_1(x)$ is irrelevant.) This demand of orthogonality leads to

$$\int\psi_1\varphi_0\,w\,dx = \int u_1\varphi_0\,w\,dx + a_{10}\int\varphi_0^2\,w\,dx = 0. \tag{9.43}$$

Since $\varphi_0$ is normalized to unity (Eq. 9.41), we have

$$a_{10} = -\int u_1\varphi_0\,w\,dx, \tag{9.44}$$

fixing the value of $a_{10}$. Normalizing, we define

$$\varphi_1(x) = \frac{\psi_1(x)}{\left(\int\psi_1^2\,w\,dx\right)^{1/2}}. \tag{9.45}$$

Generalizing, we have

$$\varphi_i(x) = \frac{\psi_i(x)}{\left(\int\psi_i^2(x)w(x)\,dx\right)^{1/2}}, \tag{9.46}$$

where

$$\psi_i(x) = u_i + a_{i0}\varphi_0 + a_{i1}\varphi_1 + \cdots + a_{i,i-1}\varphi_{i-1}. \tag{9.47}$$

The coefficients $a_{ij}$ are given by

$$a_{ij} = -\int u_i\varphi_j\,w\,dx. \tag{9.48}$$


Equation 9.48 is for unit normalization. If some other normalization is
selected, then

$$\int_a^b [\varphi_j(x)]^2 w(x)\,dx = N_j^2.$$

Equation 9.46 is replaced by

$$\varphi_i(x) = N_i\frac{\psi_i(x)}{\left(\int\psi_i^2\,w\,dx\right)^{1/2}}, \tag{9.46a}$$

and $a_{ij}$ becomes

$$a_{ij} = -\frac{\int u_i\varphi_j\,w\,dx}{N_j^2}. \tag{9.48a}$$

Equations 9.47 and 9.48 may be rewritten in terms of projection operators,
$P_j$. If we consider the $\varphi_n(x)$ to form a linear vector space, then the integral in
Eq. 9.48 may be interpreted as the projection of $u_i$ into the $\varphi_j$ "coordinate" or
the $j$th component of $u_i$. With

$$P_j u_i(x) = \left\{\int u_i(t)\varphi_j(t)w(t)\,dt\right\}\varphi_j(x),$$

Eq. 9.47 becomes

$$\psi_i(x) = \left\{1 - \sum_{j=0}^{i-1}P_j\right\}u_i(x). \tag{9.47a}$$

Subtracting off the $j$th components, $j = 0$ to $i - 1$, leaves $\psi_i(x)$ orthogonal to all
the $\varphi_j(x)$.

It will be noticed that although this Gram-Schmidt procedure is one possible 
way of constructing an orthogonal or orthonormal set, the functions $\varphi_i(x)$ are
not unique. There is an infinite number of possible orthonormal sets for a given 
interval and a given density function. As an illustration of the freedom involved, 
consider two (nonparallel) vectors A and B in the xy-plane. We may normalize 
A to unit magnitude and then form B' = aA + B so that B' is perpendicular to 
A. By normalizing B' we have completed the Gram-Schmidt orthogonalization 
for two vectors. But any two perpendicular unit vectors such as i and j could 
have been chosen as our orthonormal set. Again, with an infinite number of 
possible rotations of i and j about the z-axis, we have an infinite number of 
possible orthonormal sets. 
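
The two-vector illustration can be made concrete with a few lines of code. This is only a sketch: the function name `gram_schmidt_2d` and the sample vectors are assumptions chosen for the illustration, and the vectors are assumed nonparallel (a parallel pair would make the second normalization divide by zero).

```python
import math

def gram_schmidt_2d(A, B):
    """Orthonormalize two nonparallel plane vectors A, B given as (x, y) tuples."""
    norm_a = math.hypot(A[0], A[1])
    e1 = (A[0] / norm_a, A[1] / norm_a)            # A normalized to unit magnitude
    a = -(e1[0] * B[0] + e1[1] * B[1])             # coefficient making B' perpendicular to e1
    b_prime = (B[0] + a * e1[0], B[1] + a * e1[1])  # B' = a*A + B, with A already normalized
    norm_b = math.hypot(b_prime[0], b_prime[1])
    e2 = (b_prime[0] / norm_b, b_prime[1] / norm_b)
    return e1, e2

e1, e2 = gram_schmidt_2d((3.0, 4.0), (1.0, 0.0))
print(e1, e2, e1[0] * e2[0] + e1[1] * e2[1])
```

The pair returned here is orthonormal, but so is the pair $\mathbf{i} = (1, 0)$, $\mathbf{j} = (0, 1)$, or any rotation of it; the procedure selects one orthonormal set among infinitely many.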


EXAMPLE 9.3.1 Legendre Polynomials by Gram-Schmidt Orthogonalization


Let us form an orthonormal set from the set of functions $u_n(x) = x^n$, $n =
0, 1, 2, \ldots$. The interval is $-1 \le x \le 1$ and the density function is $w(x) = 1$.

In accordance with the Gram-Schmidt orthogonalization process described,

$$u_0 = 1 \quad\text{and}\quad \varphi_0 = \frac{1}{\sqrt{2}}. \tag{9.49}$$

Then

$$\psi_1(x) = x + a_{10}\frac{1}{\sqrt{2}} \tag{9.50}$$

and

$$a_{10} = -\int_{-1}^{1}\frac{x}{\sqrt{2}}\,dx = 0 \tag{9.51}$$

by symmetry. Normalizing $\psi_1$, we obtain

$$\varphi_1(x) = \sqrt{\tfrac{3}{2}}\,x. \tag{9.52}$$

Continuing the Gram-Schmidt process, we define

$$\psi_2(x) = x^2 + a_{20}\frac{1}{\sqrt{2}} + a_{21}\sqrt{\tfrac{3}{2}}\,x, \tag{9.53}$$

where

$$a_{20} = -\int_{-1}^{1}\frac{x^2}{\sqrt{2}}\,dx = -\frac{\sqrt{2}}{3}, \tag{9.54}$$

$$a_{21} = -\int_{-1}^{1}\sqrt{\tfrac{3}{2}}\,x^3\,dx = 0, \tag{9.55}$$

again by symmetry. Therefore

$$\psi_2(x) = x^2 - \frac{1}{3}, \tag{9.56}$$

and, on normalizing to unity, we have

$$\varphi_2(x) = \sqrt{\tfrac{5}{2}}\cdot\tfrac{1}{2}(3x^2 - 1). \tag{9.57}$$

The next function $\varphi_3(x)$ is

$$\varphi_3(x) = \sqrt{\tfrac{7}{2}}\cdot\tfrac{1}{2}(5x^3 - 3x). \tag{9.58}$$

Reference to Chapter 12 will show that

$$\varphi_n(x) = \sqrt{\frac{2n+1}{2}}\,P_n(x), \tag{9.59}$$

where P n (x) is the nth-order Legendre polynomial. Our Gram-Schmidt process 
provides a possible but very cumbersome method of generating the Legendre 
polynomials. 
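
The hand calculation of this example can be repeated by machine in exact rational arithmetic. The sketch below is an illustration under stated assumptions: polynomials are stored as coefficient lists (index = power of $x$), the names `poly_mul`, `inner`, and `gram_schmidt_monomials` are hypothetical, and, to avoid square roots, each $u_n$ is projected on the unnormalized $\psi_j$ and divided by $\int\psi_j^2\,dx$ (the form of Eq. 9.48a with $N_j^2 = \int\psi_j^2\,dx$) rather than on the normalized $\varphi_j$.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q):
    """Exact integral of p(x) q(x) over [-1, 1] with w(x) = 1 (odd powers integrate to zero)."""
    return sum(c * Fraction(2, k + 1) for k, c in enumerate(poly_mul(p, q)) if k % 2 == 0)

def gram_schmidt_monomials(count):
    """Return psi_0 ... psi_{count-1}: orthogonal, unnormalized, built from u_n = x^n."""
    psis = []
    for n in range(count):
        u = [Fraction(0)] * n + [Fraction(1)]           # u_n(x) = x^n
        psi = u[:]
        for prev in psis:
            coeff = inner(u, prev) / inner(prev, prev)   # component of u_n along prev
            for k, c in enumerate(prev):
                psi[k] -= coeff * c                      # subtract that component
        psis.append(psi)
    return psis

for psi in gram_schmidt_monomials(4):
    print([str(c) for c in psi], "norm^2 =", inner(psi, psi))
```

The third polynomial produced is $x^2 - \frac{1}{3}$, in agreement with Eq. 9.56, and dividing each $\psi_n$ by the square root of its printed norm reproduces the $\varphi_n$ of Eqs. 9.49 to 9.58.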

The equations for Gram-Schmidt orthogonalization tend to be ill-conditioned
because of the subtractions. A technique for avoiding this difficulty using
the polynomial recurrence relation is discussed by Hamming.$^2$

In Example 9.3.1 we have specified an orthogonality interval $[-1, 1]$, a unit
weighting function, and a set of functions, $x^n$, to be taken one at a time in
increasing order. Given all these specifications the Gram-Schmidt procedure is
unique (to within a normalization factor and an overall sign as discussed subsequently).
Our resulting orthogonal set, the Legendre polynomials, $P_0$ up
through $P_n$, form a complete set for the description of polynomials of order $\le n$
over $[-1, 1]$. This concept of completeness is taken up in detail in Section 9.4.
Expansions of functions in series of Legendre polynomials are found in Section 
12.3. 


Orthogonal Polynomials 

This particular example has been chosen strictly to illustrate the Gram-Schmidt
procedure. Although it has the advantage of introducing the Legendre
polynomials, the initial functions $u_n = x^n$ are not degenerate eigenfunctions and
are not solutions of Legendre's equation. They are simply a set of functions that
we have here rearranged to create an orthonormal set for the given interval and
given weighting function. The fact that we obtained the Legendre polynomials
is not quite black magic but a direct consequence of the choice of interval and
weighting function. The use of $u_n(x) = x^n$ but with other choices of interval and
weighting function leads to other sets of orthogonal polynomials as shown in
Table 9.4. We consider these polynomials in detail in Chapters 12 and 13 as
solutions of particular differential equations.


$^2$ R. W. Hamming, Numerical Methods for Scientists and Engineers, 2nd ed.
New York: McGraw-Hill (1973). See Section 27.2 and references given there.


TABLE 9.4 Orthogonal Polynomials Generated by
Gram-Schmidt Orthogonalization of $u_n(x) = x^n$, $n = 0, 1, 2, \ldots$

Polynomials            Interval                  Weighting function $w(x)$    Standard normalization

Legendre               $-1 \le x \le 1$          $1$                          $\int_{-1}^{1}[P_n(x)]^2\,dx = \frac{2}{2n+1}$
Shifted Legendre       $0 \le x \le 1$           $1$                          $\int_{0}^{1}[P_n^*(x)]^2\,dx = \frac{1}{2n+1}$
Chebyshev I            $-1 \le x \le 1$          $(1 - x^2)^{-1/2}$           $\int_{-1}^{1}[T_n(x)]^2(1 - x^2)^{-1/2}\,dx = \pi/2$ for $n \neq 0$; $\pi$ for $n = 0$
Shifted Chebyshev I    $0 \le x \le 1$           $[x(1 - x)]^{-1/2}$          $\int_{0}^{1}[T_n^*(x)]^2[x(1 - x)]^{-1/2}\,dx = \pi/2$ for $n > 0$; $\pi$ for $n = 0$
Chebyshev II           $-1 \le x \le 1$          $(1 - x^2)^{1/2}$            $\int_{-1}^{1}[U_n(x)]^2(1 - x^2)^{1/2}\,dx = \frac{\pi}{2}$
Laguerre               $0 \le x < \infty$        $e^{-x}$                     $\int_{0}^{\infty}[L_n(x)]^2e^{-x}\,dx = 1$
Associated Laguerre    $0 \le x < \infty$        $x^ke^{-x}$                  $\int_{0}^{\infty}[L_n^k(x)]^2x^ke^{-x}\,dx = \frac{(n+k)!}{n!}$
Hermite                $-\infty < x < \infty$    $e^{-x^2}$                   $\int_{-\infty}^{\infty}[H_n(x)]^2e^{-x^2}\,dx = 2^n\pi^{1/2}n!$

An examination of this orthogonalization process will reveal two arbitrary
features. First, as emphasized before, it is not necessary to normalize the functions
to unity. In the example just given we could have required

$$\int_{-1}^{1}\varphi_n(x)\varphi_m(x)\,dx = \frac{2}{2n+1}\delta_{nm}, \tag{9.60}$$

and the resulting set would have been the actual Legendre polynomials. Second,
the sign of $\varphi_n$ is always indeterminate. In the example we chose the sign by
requiring the coefficient of the highest power of $x$ in the polynomial to be positive.
For the Laguerre polynomials, on the other hand, we would require the
coefficient of the highest power to be $(-1)^n/n!$.
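
Changing the interval and weighting function changes only the integrals that enter the procedure. As a hedged sketch, the code below repeats the orthogonalization of $x^n$ with $w(x) = e^{-x}$ on $(0, \infty)$, using the standard result $\int_0^\infty x^ke^{-x}\,dx = k!$, and then rescales so that the coefficient of the highest power is $(-1)^n/n!$. The names `inner`, `monic_orthogonal`, and `laguerre`, and the device of passing the moments in as a function, are assumptions made for this illustration.

```python
from fractions import Fraction
from math import factorial

def inner(p, q, moment):
    """<p, q> = sum_ij p_i q_j moment(i + j), with moment(k) = integral of x^k w(x) dx."""
    return sum(a * b * moment(i + j) for i, a in enumerate(p) for j, b in enumerate(q))

def monic_orthogonal(count, moment):
    """Gram-Schmidt on u_n = x^n; each result has leading coefficient 1."""
    polys = []
    for n in range(count):
        u = [Fraction(0)] * n + [Fraction(1)]
        psi = u[:]
        for prev in polys:
            coeff = inner(u, prev, moment) / inner(prev, prev, moment)
            for k, c in enumerate(prev):
                psi[k] -= coeff * c
        polys.append(psi)
    return polys

def laguerre(count):
    """Rescale the monic set so the leading coefficient is (-1)^n / n!."""
    moment = lambda k: Fraction(factorial(k))   # integral_0^inf x^k e^{-x} dx = k!
    return [[c * Fraction((-1) ** n, factorial(n)) for c in p]
            for n, p in enumerate(monic_orthogonal(count, moment))]

for n, poly in enumerate(laguerre(3)):
    print(n, [str(c) for c in poly])
```

Supplying a different `moment` function (for example, the moments of $e^{-x^2}$ on the whole real line) would generate another row of Table 9.4 in the same way, up to the conventional normalization.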


EXERCISES 


9.3.1 Rework Example 9.3.1 by replacing $\varphi_n(x)$ by the conventional Legendre polynomial,
$P_n(x)$:

$$\int_{-1}^{1}[P_n(x)]^2\,dx = \frac{2}{2n+1}.$$

Using Eqs. 9.47a, 9.46a, and 9.48a, construct $P_0$, $P_1(x)$, and $P_2(x)$.

ANS. $P_0 = 1$,
$P_1 = x$,
$P_2 = \frac{3}{2}x^2 - \frac{1}{2}$.


9.3.2 Following the Gram-Schmidt procedure, construct a set of polynomials $P_n^*(x)$
orthogonal (unit weighting factor) over the range $[0, 1]$ from the set $\{1, x, x^2, \ldots\}$.
Normalize so that $P_n^*(1) = 1$.

ANS. $P_0^*(x) = 1$,
$P_1^*(x) = 2x - 1$,
$P_2^*(x) = 6x^2 - 6x + 1$,
$P_3^*(x) = 20x^3 - 30x^2 + 12x - 1$.

These are the first four shifted Legendre polynomials.

Note. The "*" is the standard notation for "shifted": $[0, 1]$ instead of $[-1, 1]$. It
does not mean complex conjugate.


9.3.3 Apply the Gram-Schmidt procedure to form the first three Laguerre polynomials:

$u_n(x) = x^n$, $n = 0, 1, 2, \ldots$, $0 \le x < \infty$, $w(x) = e^{-x}$.

The conventional normalization is

$$\int_0^{\infty}L_m(x)L_n(x)e^{-x}\,dx = \delta_{mn}.$$

ANS. $L_0 = 1$,
$L_1 = 1 - x$,
$L_2 = \dfrac{2 - 4x + x^2}{2}$.


9.3.4 You are given

(a) a set of functions $u_n(x) = x^n$, $n = 0, 1, 2, \ldots$,

(b) an interval $(0, \infty)$,

(c) a weighting function $w(x) = xe^{-x}$.

Use the Gram-Schmidt procedure to construct the first three orthonormal functions
from the set $u_n(x)$ for this interval and this weighting function.

ANS. $\varphi_0(x) = 1$,
$\varphi_1(x) = (x - 2)/\sqrt{2}$,
$\varphi_2(x) = (x^2 - 6x + 6)/2\sqrt{3}$.


9.3.5 Using the Gram-Schmidt orthogonalization procedure, construct the lowest three
Hermite polynomials:

$u_n(x) = x^n$, $n = 0, 1, 2, \ldots$, $-\infty < x < \infty$, $w(x) = e^{-x^2}$.

For this set of polynomials the usual normalization is

$$\int_{-\infty}^{\infty}H_m(x)H_n(x)w(x)\,dx = \delta_{mn}2^mm!\,\pi^{1/2}.$$

ANS. $H_0 = 1$,
$H_1 = 2x$,
$H_2 = 4x^2 - 2$.


9.3.6 Use the Gram-Schmidt orthogonalization scheme to construct the first three
Chebyshev polynomials (type I):

$u_n(x) = x^n$, $n = 0, 1, 2, \ldots$, $-1 \le x \le 1$, $w(x) = (1 - x^2)^{-1/2}$.

Take the normalization

$$\int_{-1}^{1}T_m(x)T_n(x)w(x)\,dx = \delta_{mn}\begin{cases}\pi, & m = n = 0,\\[0.5ex] \dfrac{\pi}{2}, & m = n \ge 1.\end{cases}$$

Hint. The needed integrals are given in Exercise 10.4.3.

ANS. $T_0 = 1$,
$T_1 = x$,
$T_2 = 2x^2 - 1$,
($T_3 = 4x^3 - 3x$).


9.3.7 Use the Gram-Schmidt orthogonalization scheme to construct the first three
Chebyshev polynomials (type II):

$u_n(x) = x^n$, $n = 0, 1, 2, \ldots$, $-1 \le x \le 1$, $w(x) = (1 - x^2)^{+1/2}$.

Take the normalization to be

$$\int_{-1}^{1}U_m(x)U_n(x)w(x)\,dx = \delta_{mn}\frac{\pi}{2}.$$

Hint.

$$\int_{-1}^{1}(1 - x^2)^{1/2}x^{2n}\,dx = \begin{cases}\dfrac{\pi}{2}\times\dfrac{1\cdot3\cdot5\cdots(2n - 1)}{4\cdot6\cdot8\cdots(2n + 2)}, & n = 1, 2, 3, \ldots,\\[1.5ex] \dfrac{\pi}{2}, & n = 0.\end{cases}$$

ANS. $U_0 = 1$,
$U_1 = 2x$,
$U_2 = 4x^2 - 1$.


9.3.8 As a modification of Exercise 9.3.5, apply the Gram-Schmidt orthogonalization
procedure to the set $u_n(x) = x^n$, $n = 0, 1, 2, \ldots$, $0 \le x < \infty$. Take $w(x)$ to be
$\exp(-x^2)$. Find the first two nonvanishing polynomials. Normalize so that the
coefficient of the highest power of $x$ is unity. In Exercise 9.3.5 the interval $(-\infty, \infty)$
led to the Hermite polynomials. These are certainly not the Hermite polynomials.

ANS. $\varphi_0 = 1$,
$\varphi_1 = x - \pi^{-1/2}$.


9.3.9 Form an orthogonal set over the interval $0 \le x < \infty$, using $u_n(x) = e^{-nx}$, $n = 1$,
2, 3, .... Take the weighting factor, $w(x)$, to be unity. These functions are solutions
of $u_n'' - n^2u_n = 0$, which is clearly already in Sturm-Liouville (self-adjoint) form.
Why doesn't the Sturm-Liouville theory guarantee the orthogonality of these
functions?


9.4 COMPLETENESS OF EIGENFUNCTIONS 

The third important property of an Hermitian operator is that its eigenfunctions
form a complete set. This completeness means that any well-behaved
(at least piecewise continuous) function $F(x)$ can be approximated by a series

$$F(x) = \sum_{n=0}^{\infty}a_n\varphi_n(x) \tag{9.61}$$

to any desired degree of accuracy.$^1$ More precisely, the set $\varphi_n(x)$ is called complete$^2$
if the limit of the mean square error vanishes:

$$\lim_{m\to\infty}\int_a^b\left[F(x) - \sum_{n=0}^{m}a_n\varphi_n(x)\right]^2w(x)\,dx = 0. \tag{9.62}$$

Technically, the integral here is a Lebesgue integral. We have not required that
the error vanish identically in $[a, b]$ but only that the integral of the error squared
go to zero.

This convergence in the mean, Eq. 9.62, should be compared with uniform
convergence, Section 5.5, Eq. 5.67. Clearly, uniform convergence implies convergence
in the mean but the converse does not hold; convergence in the mean
is less restrictive. Specifically, Eq. 9.62 is not upset by piecewise continuous
functions, a finite number of finite discontinuities. Equation 9.62 is perfectly 
adequate for our purposes and is far more convenient than Eq. 5.67. Indeed, 
since we frequently use eigenfunctions to describe discontinuous functions, 
convergence in the mean is all we can expect. 
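
The distinction between the two kinds of convergence can be seen numerically with the square wave of Example 9.2.2. The sketch below is an illustration under stated assumptions ($h = 1$, $w(x) = 1$, a midpoint-rule grid, and the hypothetical function names shown): the integral of the squared error shrinks as terms are added, while the largest pointwise error near the discontinuity does not.

```python
import math

def f(x, h=1.0):
    """Square wave of Eq. 9.36."""
    return h / 2 if x > 0 else -h / 2

def partial_sum(x, terms, h=1.0):
    """First `terms` terms of the Fourier expansion, Eq. 9.37."""
    return 2 * h / math.pi * sum(math.sin((2 * n + 1) * x) / (2 * n + 1) for n in range(terms))

def mean_square_error(terms, points=4000):
    """Midpoint-rule estimate of the integral of [f - S]^2 over (-pi, pi)."""
    step = 2 * math.pi / points
    xs = (-math.pi + (k + 0.5) * step for k in range(points))
    return step * sum((f(x) - partial_sum(x, terms)) ** 2 for x in xs)

def max_error(terms, points=4000):
    """Largest |f - S| found on the same grid of sample points."""
    step = 2 * math.pi / points
    xs = (-math.pi + (k + 0.5) * step for k in range(points))
    return max(abs(f(x) - partial_sum(x, terms)) for x in xs)

for terms in (1, 5, 20):
    print(terms, mean_square_error(terms), max_error(terms))
```

The largest error stays near $h/2$ because every partial sum passes through zero at the discontinuity, so the convergence cannot be uniform, yet Eq. 9.62 is still satisfied.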

In the language of linear algebra, we have a linear space, a function space. The 
linearly independent, orthonormal functions $\varphi_n(x)$ form the basis for this
(infinite-dimensional) space. Equation 9.61 is a statement that the functions
$\varphi_n(x)$ span this linear space. With an inner product defined by Eq. 9.64, our linear
space is a Hilbert space. 

The question of completeness of a set of functions is often determined by 
comparison with a Laurent series, Section 6.5. In Section 14.1 this is done for 
Fourier series, thus establishing the completeness of Fourier series. For all 
orthogonal polynomials mentioned in Section 9.3 it is possible to find a polynomial
expansion of each power of $z$,

$$z^n = \sum_{i=0}^{n}a_iP_i(z), \tag{9.63}$$

where $P_i(z)$ is the $i$th polynomial. Exercises 12.4.6, 13.1.8, 13.2.5, and 13.3.22 are
specific examples of Eq. 9.63. Using Eq. 9.63, we may reexpress the Laurent
expansion of $f(z)$ in terms of the polynomials, showing that the polynomial
expansion exists (and existing, it is unique, Exercise 9.4.1). The limitation of this
Laurent series development is that it requires the function to be analytic.
Equations 9.61 and 9.62 are more general. $F(x)$ may be only piecewise continuous.
Numerous examples of the representation of such piecewise continuous
functions appear in Chapter 14 (Fourier series). A proof that our Sturm-Liouville
eigenfunctions form complete sets appears in Courant and Hilbert.$^3$
In Eq. 9.61 the expansion coefficients $a_m$ may be determined by

$$a_m = \int_a^bF(x)\varphi_m(x)w(x)\,dx. \tag{9.64}$$


$^1$ If we have a finite set, as with vectors, the summation is over the number of
linearly independent members of the set.

$^2$ Many authors use the term closed here.

$^3$ R. Courant and D. Hilbert, Methods of Mathematical Physics, Vol. 1
(English translation). New York: Interscience Publishers (1953), Chapter 6,
Section 3.


FIG. 9.2 Linear independence, orthogonality, and uniqueness. (Figure labels: Ex. 9.4.2, Ex. 9.4.1.)


This follows from multiplying Eq. 9.61 by $\varphi_m(x)w(x)$ and integrating. From the
orthogonality of the eigenfunctions, $\varphi_n(x)$, only the $m$th term survives. Here we
see the value of orthogonality. Equation 9.64 may be compared with the dot or
inner product of vectors, Section 1.3, and $a_m$ interpreted as the $m$th projection of
the function $F(x)$. Often the coefficient $a_m$ is called a generalized Fourier
coefficient.

For a known function, $F(x)$, Eq. 9.64 gives $a_m$ as a definite integral which can
always be evaluated, by machine if not analytically. 
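
As an illustration of evaluating Eq. 9.64 by machine, the following sketch computes generalized Fourier coefficients with respect to the orthonormal Legendre functions $\varphi_0, \ldots, \varphi_3$ of Example 9.3.1 on $[-1, 1]$ with $w(x) = 1$. The use of Simpson's rule, the number of intervals, and the names `phi`, `simpson`, and `coefficient` are assumptions made for this example only.

```python
import math

def phi(n, x):
    """Orthonormal Legendre functions of Example 9.3.1, n = 0..3."""
    p = (1.0, x, 0.5 * (3 * x * x - 1), 0.5 * (5 * x ** 3 - 3 * x))[n]
    return math.sqrt((2 * n + 1) / 2.0) * p

def simpson(func, a, b, intervals=200):
    """Composite Simpson rule; `intervals` must be even."""
    h = (b - a) / intervals
    total = func(a) + func(b)
    for k in range(1, intervals):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0

def coefficient(F, m):
    """a_m of Eq. 9.64 on [-1, 1] with w(x) = 1."""
    return simpson(lambda x: F(x) * phi(m, x), -1.0, 1.0)

F = lambda x: x * x
coeffs = [coefficient(F, m) for m in range(4)]
print(coeffs)
print(sum(a * phi(m, 0.3) for m, a in enumerate(coeffs)), F(0.3))
```

For $F(x) = x^2$ only $a_0$ and $a_2$ are nonzero, and the four-term sum reproduces $F$ to within the quadrature error, as expected for a polynomial of order 2.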

For examples of particular eigenfunction expansions, see the following:
Fourier series, Section 9.2 and Chapter 14; Bessel and Fourier-Bessel expansions,
Section 11.2; Legendre series, Section 12.3; Laplace series, Section 12.6;
Hermite series, Section 13.1; Laguerre series, Section 13.2; and Chebyshev series,
Section 13.3.

It may also happen that the eigenfunction expansion, Eq. 9.61, is the expansion
of an unknown $F(x)$ in a series of known eigenfunctions $\varphi_n(x)$ with unknown
coefficients $a_n$. An example would be the quantum chemist's attempt to describe
an (unknown) molecular wave function as a linear combination of known
atomic wave functions. The unknown coefficients $a_n$ would be determined by a
variational technique (Rayleigh-Ritz, Section 17.8).

The relationships among eigenfunctions, orthogonal sets of functions, 
linearly independent sets of functions, and uniqueness of representations are 
presented schematically in Fig. 9.2.


Bessel's Inequality

If the set of functions $\varphi_n(x)$ does not form a complete set, possibly because we
simply have not included the required infinite number of members of an infinite
set, we are led to Bessel's inequality. First, consider the finite case. Let $\mathbf{A}$ be an
$n$-component vector,

$$\mathbf{A} = \mathbf{e}_1a_1 + \mathbf{e}_2a_2 + \cdots + \mathbf{e}_na_n, \tag{9.65}$$

in which $\mathbf{e}_i$ is a unit vector and $a_i$ is the corresponding component (projection)
of $\mathbf{A}$, that is,

$$a_i = \mathbf{A}\cdot\mathbf{e}_i. \tag{9.66}$$

Then

$$\left(\mathbf{A} - \sum_i\mathbf{e}_ia_i\right)^2 \ge 0. \tag{9.67}$$

If we sum over all $n$ components, clearly, the summation equals $\mathbf{A}$ by Eq. 9.65
and the equality holds. If, however, the summation does not include all $n$ components,
the inequality results. By expanding Eq. 9.67 and remembering that the
unit vectors satisfy an orthogonality relation,

$$\mathbf{e}_i\cdot\mathbf{e}_j = \delta_{ij}, \tag{9.68}$$

we have

$$A^2 \ge \sum_ia_i^2. \tag{9.69}$$

This is Bessel's inequality.

For functions we consider the integral

$$\int_a^b\left[f(x) - \sum_ia_i\varphi_i(x)\right]^2w(x)\,dx \ge 0. \tag{9.70}$$

This is the continuum analog of Eq. 9.67, letting $n \to \infty$ and replacing the summation
by an integration. Again, with the weighting factor $w(x) > 0$, the integrand
is nonnegative. The integral vanishes by Eq. 9.61 if we have a complete
set. Otherwise it is positive. Expanding the squared term, we obtain

$$\int_a^b[f(x)]^2w(x)\,dx - 2\sum_ia_i\int_a^bf(x)\varphi_i(x)w(x)\,dx + \sum_ia_i^2 \ge 0. \tag{9.71}$$

Applying Eq. 9.64, we have

$$\int_a^b[f(x)]^2w(x)\,dx \ge \sum_ia_i^2. \tag{9.72}$$

Hence the sum of the squares of the expansion coefficients $a_i$ is less than or equal
to the weighted integral of $[f(x)]^2$, the equality holding if and only if the expansion
is exact, that is, if the set of functions $\varphi_n(x)$ is a complete set.

In later chapters when we consider eigenfunctions that form complete sets
(such as Legendre polynomials), Eq. 9.72 with the equal sign holding will be
called a Parseval relation. 

Bessel’s inequality has a variety of uses, including proof of convergence of the 
Fourier series. 
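
Bessel's inequality can be watched at work on the square wave of Example 9.2.2. In the sketch below the orthonormal functions are taken to be $\varphi_n(x) = \sin nx/\sqrt{\pi}$ on $(-\pi, \pi)$ with $w(x) = 1$, so that $a_n = \sqrt{\pi}\,b_n$ with $b_n$ from that example; the cosine terms contribute nothing because $a_n = 0$ there. The function names and the default $h = 1$ are assumptions for the illustration.

```python
import math

def bessel_partial_sum(max_n, h=1.0):
    """Sum of a_n^2 for n = 1..max_n, with a_n the coefficient along sin(nx)/sqrt(pi)."""
    total = 0.0
    for n in range(1, max_n + 1):
        b_n = h / (n * math.pi) * (1 - math.cos(n * math.pi))   # b_n of Example 9.2.2
        a_n = math.sqrt(math.pi) * b_n
        total += a_n ** 2
    return total

def norm_squared(h=1.0):
    """Integral of [f(x)]^2 over (-pi, pi) for the square wave of Eq. 9.36."""
    return 2 * math.pi * (h / 2) ** 2

for max_n in (1, 10, 100, 1000):
    print(max_n, bessel_partial_sum(max_n), norm_squared())
```

Every finite sum falls below $\int f^2\,dx$, in accordance with Eq. 9.72, and the gap closes as more functions are included.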

Schwarz Inequality 

The frequently used Schwarz inequality is similar to the Bessel inequality.
Consider the quadratic equation

$$\sum_{i=1}^{n}(a_ix + b_i)^2 = \sum_{i=1}^{n}a_i^2\left(x + \frac{b_i}{a_i}\right)^2 = 0. \tag{9.73}$$

If $b_i/a_i$ = constant, $c$, then the solution is $x = -c$. If $b_i/a_i$ is not a constant, all
terms cannot vanish simultaneously for real $x$. So the solution must be complex.
Expanding, we find that

$$x^2\sum_ia_i^2 + 2x\sum_ia_ib_i + \sum_ib_i^2 = 0, \tag{9.74}$$

and since $x$ is complex (or $= -b_i/a_i$), the quadratic formula$^4$ for $x$ leads to

$$\left(\sum_ia_ib_i\right)^2 \le \left(\sum_ia_i^2\right)\left(\sum_ib_i^2\right), \tag{9.75}$$

the equality holding when $b_i/a_i$ equals a constant.

Once more, in terms of vectors, we have

$$(\mathbf{a}\cdot\mathbf{b})^2 = a^2b^2\cos^2\theta \le a^2b^2, \tag{9.76}$$

where $\theta$ is the included angle.

The Schwarz inequality for functions has the form

$$\left|\int_a^bf^*(x)g(x)\,dx\right|^2 \le \int_a^bf^*(x)f(x)\,dx\int_a^bg^*(x)g(x)\,dx, \tag{9.77}$$

the equality holding if and only if $g(x) = \alpha f(x)$, $\alpha$ being a constant. To prove this
function form of the Schwarz inequality,$^5$ consider a complex function $\psi(x) =
f(x) + \lambda g(x)$ with $\lambda$ a complex constant. The functions $f(x)$ and $g(x)$ are any
two functions (for which the integrals exist). Multiplying by the complex
conjugate and integrating, we obtain

$$\int_a^b\psi^*\psi\,dx = \int_a^bf^*f\,dx + \lambda\int_a^bf^*g\,dx + \lambda^*\int_a^bg^*f\,dx + \lambda\lambda^*\int_a^bg^*g\,dx \ge 0. \tag{9.78}$$

The $\ge 0$ appears since $\psi^*\psi$ is nonnegative, the equal ($=$) sign holding only
if $\psi(x)$ is identically zero. Noting that $\lambda$ and $\lambda^*$ are linearly independent, we
differentiate with respect to one of them and set the derivative equal to zero
to minimize $\int_a^b\psi^*\psi\,dx$:

$$\frac{\partial}{\partial\lambda^*}\int_a^b\psi^*\psi\,dx = \int_a^bg^*f\,dx + \lambda\int_a^bg^*g\,dx = 0.$$

This yields

$$\lambda = -\frac{\int_a^bg^*f\,dx}{\int_a^bg^*g\,dx}. \tag{9.79a}$$

Taking the complex conjugate, we obtain

$$\lambda^* = -\frac{\int_a^bf^*g\,dx}{\int_a^bg^*g\,dx}. \tag{9.79b}$$


$^4$ With discriminant $b^2 - 4ac$ negative (or zero).

$^5$ An alternate derivation is provided by the inequality

$$\int_a^b\int_a^b[f(x)g(y) - f(y)g(x)]^*[f(x)g(y) - f(y)g(x)]\,dx\,dy \ge 0.$$

Substituting these values of X and 2* back into Eq. 9.78, we obtain Eq. 9.77, 
the Schwarz inequality. 

In quantum mechanics $f(x)$ and $g(x)$ might each represent a state or configuration
of a physical system. Then the Schwarz inequality guarantees that
the inner product $\int_a^b f^*(x)g(x)\,dx$ exists. In some texts the Schwarz inequality
is a key step in the derivation of the Heisenberg uncertainty principle.
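
The function form of the Schwarz inequality, Eq. 9.77, can be checked numerically. The sketch below is illustrative only: the two complex functions, the interval (0, 1), and the helper names `inner` and `schwarz_sides` are assumptions chosen for the example.

```python
import cmath

def inner(f, g, a=0.0, b=1.0, steps=2000):
    """Midpoint-rule approximation to the integral of f*(x) g(x) over (a, b)."""
    h = (b - a) / steps
    total = 0j
    for k in range(steps):
        x = a + (k + 0.5) * h
        total += f(x).conjugate() * g(x)
    return h * total

def schwarz_sides(f, g):
    """Return (left side, right side) of Eq. 9.77 for the pair f, g."""
    lhs = abs(inner(f, g)) ** 2
    rhs = inner(f, f).real * inner(g, g).real
    return lhs, rhs

f = lambda x: cmath.exp(1j * x) * (1.0 + x)   # illustrative complex function
g = lambda x: complex(x * x, 1.0)             # a second, independent function
alpha = 2.0 - 3.0j
g_dependent = lambda x: alpha * f(x)          # g = alpha * f: the equality case

print(schwarz_sides(f, g))            # left side strictly smaller
print(schwarz_sides(f, g_dependent))  # the two sides agree to rounding
```

For linearly independent functions the left side is strictly smaller; when $g(x) = \alpha f(x)$ the two sides coincide up to rounding error, as the equality condition requires.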

The function notation of Eqs. 9.77 and 9.78 is relatively cumbersome. In
advanced mathematical physics and especially in quantum mechanics it is
common to use a different notation:

$$\langle f|g\rangle \equiv \int_a^b f^*(x)g(x)\,dx.$$

Using this new notation, we simply understand the range of integration, $(a, b)$,
and any weighting function. In this notation the Schwarz inequality becomes

$$|\langle f|g\rangle|^2 \le \langle f|f\rangle\langle g|g\rangle. \tag{9.77a}$$

If $g(x)$ is a normalized eigenfunction, $\varphi_i(x)$, Eq. 9.77 yields [here $w(x) = 1$]

$$a_i^* a_i \le \int_a^b f^*(x)f(x)\,dx, \tag{9.80}$$

a result that also follows from Eq. 9.72.


Dirac Delta Function 

Let us assume that we have a complete, orthonormal set of real functions,
$\varphi_n(x)$, and use them to represent the Dirac delta function. We assume an
expansion of the form

$$\delta(x - t) = \sum_{n=0}^{\infty} a_n(t)\varphi_n(x) \tag{9.81}$$

(Eq. 9.61), with the coefficients $a_n$ functions of the variable $t$. Multiplying by
$\varphi_m(x)$ and integrating over the orthogonality interval (Eq. 9.64), we have

$$a_m(t) = \int_a^b \delta(x - t)\varphi_m(x)\,dx = \varphi_m(t) \tag{9.82}$$

or

$$\delta(x - t) = \sum_{n=0}^{\infty} \varphi_n(x)\varphi_n(t) = \delta(t - x). \tag{9.83}$$

(For convenience we assume that $\varphi_n(x)$ has been redefined to include $[w(x)]^{1/2}$
if $w(x) \ne 1$.) This series in Eq. 9.83 is assuredly not uniformly convergent, but
it may be used as part of an integrand in which the ensuing integration will 
make it convergent (compare Section 5.5). 

Suppose we form the integral

$$\int F(t)\delta(t - x)\,dt,$$

where it is assumed that $F(t)$ can be expanded in a series of eigenfunctions,
$\varphi_p(t)$. We obtain

$$\int F(t)\delta(t - x)\,dt = \int \sum_{p=0}^{\infty} a_p\varphi_p(t) \sum_{n=0}^{\infty} \varphi_n(x)\varphi_n(t)\,dt = \sum_{p=0}^{\infty} a_p\varphi_p(x) = F(x), \tag{9.84}$$

the cross products $\varphi_p\varphi_n$ ($n \ne p$) vanishing by orthogonality (Eq. 9.39). Referring
back to the definition of the Dirac delta function (Sections 1.15 and 8.7), we see
that our series representation, Eq. 9.83, satisfies the defining property of the
Dirac delta function and therefore is a representation of it. This representation
of the Dirac delta function is called closure. The assumption of completeness
of a set of functions for expansion of $\delta(x - t)$ yields the closure relation. The
converse, that closure implies completeness, is the topic of Exercise 9.4.10.
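
The sifting property of the truncated closure sum can be seen numerically. The sketch below assumes the orthonormal set $\varphi_n(x) = \sqrt{2/\pi}\,\sin nx$ on $(0, \pi)$ (indexed from $n = 1$) and the test function $F(t) = t(\pi - t)$, which vanishes at both end points. These choices, and the function names, are illustrative assumptions rather than anything prescribed by the text.

```python
import math

def phi(n, x):
    """Assumed orthonormal sine set on (0, pi), n = 1, 2, 3, ..."""
    return math.sqrt(2.0 / math.pi) * math.sin(n * x)

def closure_kernel(x, t, n_max):
    """Partial sum of the closure series, Eq. 9.83, truncated at n_max terms."""
    return sum(phi(n, x) * phi(n, t) for n in range(1, n_max + 1))

def F(t):
    return t * (math.pi - t)  # illustrative test function

def smeared(x, n_max, steps=2000):
    """Midpoint-rule integral of F(t) times the truncated kernel over (0, pi)."""
    h = math.pi / steps
    return h * sum(F((k + 0.5) * h) * closure_kernel(x, (k + 0.5) * h, n_max)
                   for k in range(steps))

for n_max in (1, 5, 40):
    print(n_max, smeared(1.0, n_max), F(1.0))
```

The kernel itself does not converge pointwise, but once it is integrated against a smooth $F(t)$ the result settles on $F(x)$, which is the sense in which Eq. 9.83 represents the delta function.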


Green's Function 

A series somewhat similar to that representing $\delta(x - t)$ results when we
expand the Green's function in the eigenfunctions of the corresponding homogeneous
equation. In the inhomogeneous Helmholtz equation we have

$$\nabla^2\psi(\mathbf{r}) + k^2\psi(\mathbf{r}) = -\rho(\mathbf{r}). \tag{9.85}$$

The homogeneous Helmholtz equation is satisfied by its eigenfunctions $\varphi_n$,

$$\nabla^2\varphi_n(\mathbf{r}) + k_n^2\varphi_n(\mathbf{r}) = 0. \tag{9.86}$$

As outlined in Section 8.7, the Green's function $G(\mathbf{r}_1, \mathbf{r}_2)$ satisfies the point
source equation

$$\nabla^2 G(\mathbf{r}_1, \mathbf{r}_2) + k^2 G(\mathbf{r}_1, \mathbf{r}_2) = -\delta(\mathbf{r}_1 - \mathbf{r}_2). \tag{9.87}$$

We expand the Green's function in a series of eigenfunctions of the homogeneous
equation (9.86), that is,

$$G(\mathbf{r}_1, \mathbf{r}_2) = \sum_{n=0}^{\infty} a_n(\mathbf{r}_2)\varphi_n(\mathbf{r}_1), \tag{9.88}$$

and by substituting into Eq. 9.87 obtain

$$-\sum_{n=0}^{\infty} a_n(\mathbf{r}_2)k_n^2\varphi_n(\mathbf{r}_1) + k^2\sum_{n=0}^{\infty} a_n(\mathbf{r}_2)\varphi_n(\mathbf{r}_1) = -\sum_{n=0}^{\infty} \varphi_n(\mathbf{r}_1)\varphi_n(\mathbf{r}_2). \tag{9.89}$$

Here $\delta(\mathbf{r}_1 - \mathbf{r}_2)$ has been replaced by its eigenfunction expansion, Eq. 9.83.
When we employ the orthogonality of $\varphi_n(\mathbf{r}_1)$ to isolate $a_n$ and then substitute
into Eq. 9.88, the Green's function becomes

$$G(\mathbf{r}_1, \mathbf{r}_2) = \sum_{n=0}^{\infty} \frac{\varphi_n(\mathbf{r}_1)\varphi_n(\mathbf{r}_2)}{k_n^2 - k^2}, \tag{9.90}$$

a bilinear expansion, symmetric with respect to $\mathbf{r}_1$ and $\mathbf{r}_2$ as expected. Finally,
$\psi(\mathbf{r}_1)$, the desired solution of the inhomogeneous equation, is given by

$$\psi(\mathbf{r}_1) = \int G(\mathbf{r}_1, \mathbf{r}_2)\rho(\mathbf{r}_2)\,d\tau_2. \tag{9.91}$$

If we generalize our inhomogeneous differential equation to

$$\mathcal{L}\psi + \lambda\psi = -\rho, \tag{9.92}$$

where $\mathcal{L}$ is an Hermitian operator, we find that

$$G(\mathbf{r}_1, \mathbf{r}_2) = \sum_{n=0}^{\infty} \frac{\varphi_n(\mathbf{r}_1)\varphi_n(\mathbf{r}_2)}{\lambda_n - \lambda}, \tag{9.93}$$

where $\lambda_n$ is the nth eigenvalue and $\varphi_n$ the corresponding orthonormal
eigenfunction of the homogeneous differential equation

$$\mathcal{L}\varphi_n + \lambda_n\varphi_n = 0. \tag{9.94}$$

The Green’s function will be encountered again in Section 16.5, in which we 
investigate it in more detail and relate it to integral equations. 
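
A one-dimensional sketch may make Eqs. 9.90 and 9.91 concrete. It assumes the problem $\psi'' + k^2\psi = -\rho$ on $(0, \pi)$ with $\psi(0) = \psi(\pi) = 0$, for which $\varphi_n(x) = \sqrt{2/\pi}\,\sin nx$ and $k_n = n$ ($n = 1, 2, \ldots$). The choices $k = 1/2$ and $\rho = 1$ are illustrative assumptions, picked because the exact solution $4(\cos\tfrac{x}{2} + \sin\tfrac{x}{2} - 1)$ is then easy to verify by substitution.

```python
import math

K = 0.5  # assumed wave number k; it must differ from every eigenvalue k_n = n

def phi(n, x):
    """Orthonormal eigenfunctions of the assumed 1D problem on (0, pi)."""
    return math.sqrt(2.0 / math.pi) * math.sin(n * x)

def green(x1, x2, n_max=60):
    """Truncated bilinear expansion, Eq. 9.90, with k_n**2 = n**2."""
    return sum(phi(n, x1) * phi(n, x2) / (n * n - K * K)
               for n in range(1, n_max + 1))

def rho(x):
    return 1.0  # illustrative source term

def psi_series(x1, steps=2000):
    """Eq. 9.91 in one dimension, integrated by the midpoint rule."""
    h = math.pi / steps
    return h * sum(green(x1, (j + 0.5) * h) * rho((j + 0.5) * h)
                   for j in range(steps))

def psi_exact(x):
    """Closed-form solution, valid only for K = 0.5 and rho = 1."""
    return 4.0 * (math.cos(x / 2.0) + math.sin(x / 2.0) - 1.0)

for x in (0.5, math.pi / 2.0, 2.5):
    print(x, psi_series(x), psi_exact(x))
```

The symmetry $G(x_1, x_2) = G(x_2, x_1)$ is visible directly in `green`, and the denominator shows why the expansion fails if $k$ coincides with one of the eigenvalues $k_n$.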

Summary — Linear Vector Spaces — 

Completeness 

Here we summarize some properties of linear vector space, first with the 
vectors taken to be the familiar real vectors of Chapter 1 and then with the 
vectors taken to be ordinary functions — polynomials. The concept of complete- 
ness is developed for finite vector spaces and carried over into infinite vector 
spaces. 

1v. We shall describe our linear vector space with a set of n linearly independent
vectors $\mathbf{e}_i$, $i = 1, 2, \ldots, n$. If $n = 3$, $\mathbf{e}_1 = \mathbf{i}$, $\mathbf{e}_2 = \mathbf{j}$, and $\mathbf{e}_3 = \mathbf{k}$. The n $\mathbf{e}_i$
span the linear vector space.

1f. We shall describe our linear vector (function) space with a set of n
linearly independent functions, $\varphi_i(x)$, $i = 0, 1, \ldots, n - 1$. The index i starts
with 0 to agree with the labeling of the classical polynomials. Here $\varphi_i(x)$ is
assumed to be a polynomial of degree i. The n $\varphi_i(x)$ span the linear vector
(function) space.

2v. The vectors in our linear vector space satisfy the following relations
(Section 1.2; the vector components are numbers):

a. Vector addition is commutative      $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$
b. Vector addition is associative      $[\mathbf{u} + \mathbf{v}] + \mathbf{w} = \mathbf{u} + [\mathbf{v} + \mathbf{w}]$
c. There is a null vector              $\mathbf{0} + \mathbf{v} = \mathbf{v}$
d. Multiplication by a scalar
     Distributive                      $a[\mathbf{u} + \mathbf{v}] = a\mathbf{u} + a\mathbf{v}$
     Distributive                      $(a + b)\mathbf{u} = a\mathbf{u} + b\mathbf{u}$
     Associative                       $a[b\mathbf{u}] = (ab)\mathbf{u}$
e. Multiplication
     By unit scalar                    $1\mathbf{u} = \mathbf{u}$
     By zero                           $0\mathbf{u} = \mathbf{0}$
     Negative vector                   $(-1)\mathbf{u} = -\mathbf{u}$

2f. The functions in our linear function space satisfy the properties listed
for vectors (substitute "function" for "vector").

$f(x) + g(x) = g(x) + f(x)$
$[f(x) + g(x)] + h(x) = f(x) + [g(x) + h(x)]$
$0 + f(x) = f(x)$
$a[f(x) + g(x)] = af(x) + ag(x)$
$(a + b)f(x) = af(x) + bf(x)$
$a[bf(x)] = (ab)f(x)$
$1\cdot f(x) = f(x)$
$0\cdot f(x) = 0$
$(-1)\cdot f(x) = -f(x)$

3v. In n-dimensional vector space an arbitrary vector $\mathbf{c}$ is described by its
n components $(c_1, c_2, \ldots, c_n)$ or

$$\mathbf{c} = \sum_{i=1}^{n} c_i\mathbf{e}_i.$$

When (1) n $\mathbf{e}_i$ are linearly independent and (2) span the n-dimensional vector
space, then the $\mathbf{e}_i$ form a basis and constitute a complete set.

3f. In n-dimensional function space a polynomial of degree $m \le n - 1$ is
described by

$$f(x) = \sum_{i=0}^{n-1} c_i\varphi_i(x).$$

When (1) the n $\varphi_i(x)$ are linearly independent and (2) span the n-dimensional
function space, then the $\varphi_i(x)$ form a basis and constitute a complete set (for
describing polynomials of degree $m \le n - 1$).

4v. An inner product (scalar, dot product) is defined by

$$\mathbf{c}\cdot\mathbf{d} = \sum_{i=1}^{n} c_i d_i.$$

(If $\mathbf{c}$ and $\mathbf{d}$ have complex components, the inner product is defined as $\sum_{i=1}^{n} c_i^* d_i$.)
The inner product has the properties of

a. Distributive law of addition    $\mathbf{c}\cdot(\mathbf{d} + \mathbf{e}) = \mathbf{c}\cdot\mathbf{d} + \mathbf{c}\cdot\mathbf{e}$
b. Scalar multiplication           $\mathbf{c}\cdot a\mathbf{d} = a\,\mathbf{c}\cdot\mathbf{d}$
c. Complex conjugation             $\mathbf{c}\cdot\mathbf{d} = (\mathbf{d}\cdot\mathbf{c})^*$

4f. An inner product is defined by

$$\langle f|g\rangle = \int_a^b f^*(x)g(x)w(x)\,dx.$$

The choice of the weighting function $w(x)$ and the interval $(a, b)$ follows from
the differential equation satisfied by $\varphi_i(x)$ and the boundary conditions (Section
9.1). In matrix terminology, Section 4.2, $|g\rangle$ is a column vector and $\langle f|$ is a row
vector, the adjoint of $|f\rangle$.

The inner product has the properties listed for vectors:

a. $\langle f|g + h\rangle = \langle f|g\rangle + \langle f|h\rangle$
b. $\langle f|ag\rangle = a\langle f|g\rangle$
c. $\langle f|g\rangle = \langle g|f\rangle^*$

5v. Orthogonality.

$$\mathbf{e}_i\cdot\mathbf{e}_j = 0, \qquad i \ne j.$$

If the n $\mathbf{e}_i$ are not already orthogonal, the Gram-Schmidt process may be used
to create an orthogonal set.

5f. Orthogonality.

$$\langle\varphi_i|\varphi_j\rangle = \int_a^b \varphi_i^*(x)\varphi_j(x)w(x)\,dx = 0, \qquad i \ne j.$$

If the n $\varphi_i(x)$ are not already orthogonal, the Gram-Schmidt process (Section
9.3) may be used to create an orthogonal set.

6v. Definition of norm.

$$|\mathbf{c}| = (\mathbf{c}\cdot\mathbf{c})^{1/2} = \left(\sum_{i=1}^{n} c_i^2\right)^{1/2}.$$

The basis vectors $\mathbf{e}_i$ are taken to have unit norm (length), $\mathbf{e}_i\cdot\mathbf{e}_i = 1$. The
components of $\mathbf{c}$ are given by

$$c_i = \mathbf{e}_i\cdot\mathbf{c}, \qquad i = 1, 2, \ldots, n.$$

6f. Definition of norm.

$$\|f\| = \langle f|f\rangle^{1/2} = \left[\int_a^b |f(x)|^2 w(x)\,dx\right]^{1/2} = \left[\sum_{i=0}^{n-1} |c_i|^2\right]^{1/2},$$

Parseval's identity. $\|f\| > 0$ unless $f(x)$ is identically zero. The basis functions
$\varphi_i(x)$ may be taken to have unit norm (unit normalization),

$$\|\varphi_i\| = 1.$$

The expansion coefficients of our polynomial $f(x)$ are given by

$$c_i = \langle\varphi_i|f\rangle, \qquad i = 0, 1, \ldots, n - 1.$$

7v. Bessel's inequality.

$$\mathbf{c}\cdot\mathbf{c} \ge \sum_i c_i^2.$$

If the = sign holds for all $\mathbf{c}$, it indicates that the $\mathbf{e}_i$ span the vector space; that is,
they are complete.

7f. Bessel's inequality.

$$\langle f|f\rangle = \int_a^b |f(x)|^2 w(x)\,dx \ge \sum_i |c_i|^2.$$

If the equal sign holds for all allowable f's, it indicates that the $\varphi_i(x)$ span the
function space, that is, they are complete.

8v. Schwarz inequality.

$$\mathbf{c}\cdot\mathbf{d} \le |\mathbf{c}|\cdot|\mathbf{d}|.$$

The equal sign holds when $\mathbf{c}$ is a multiple of $\mathbf{d}$. If the angle included between $\mathbf{c}$
and $\mathbf{d}$ is $\theta$, then $|\cos\theta| \le 1$.

8f. Schwarz inequality.

$$|\langle f|g\rangle| \le \langle f|f\rangle^{1/2}\langle g|g\rangle^{1/2} = \|f\|\cdot\|g\|.$$

The equals sign holds when $f(x)$ and $g(x)$ are linearly dependent, that is, when
$f(x)$ is a multiple of $g(x)$.

Now, let $n \to \infty$, forming an infinite-dimensional linear vector space, $l^2$.

9v. In an infinite-dimensional space our vector $\mathbf{c}$ is

$$\mathbf{c} = \sum_{i=1}^{\infty} c_i\mathbf{e}_i.$$

We require that

$$\sum_{i=1}^{\infty} c_i^2 < \infty.$$

The components of $\mathbf{c}$ are given by

$$c_i = \mathbf{e}_i\cdot\mathbf{c}, \qquad i = 1, 2, \ldots, \infty,$$

exactly as in a finite-dimensional space.

Then let $n \to \infty$, forming an infinite-dimensional linear vector (function)
space, $L^2$. The L stands for Lebesgue, the superscript 2 for the 2 in $|f(x)|^2$.
Our functions need no longer be polynomials but we do require that $f(x)$ be
at least piecewise continuous (Dirichlet conditions for Fourier series) and that
$\langle f|f\rangle = \int_a^b |f(x)|^2 w(x)\,dx$ exist. This latter condition is often stated as a
requirement that $f(x)$ be square integrable.

9f. Cauchy sequence.

Let

$$f_n(x) = \sum_{i=0}^{n} c_i\varphi_i(x).$$

If

$$\|f(x) - f_n(x)\| \to 0 \quad \text{as } n \to \infty$$

or

$$\lim_{n\to\infty} \int \left|f(x) - \sum_{i=0}^{n} c_i\varphi_i(x)\right|^2 w(x)\,dx = 0,$$

then we have convergence in the mean. This is analogous to the partial-sum
Cauchy sequence criterion for the convergence of an infinite series, Section 5.1.

If every Cauchy sequence of allowable vectors (square integrable, piecewise
continuous functions) converges to a limit vector in our linear space, the space
is said to be complete. Then

$$f(x) = \sum_{i=0}^{\infty} c_i\varphi_i(x) \qquad \text{(almost everywhere)}$$

in the sense of convergence in the mean. As noted before, this is a weaker
requirement than point-wise convergence (fixed value of x) or uniform convergence.


Expansion (Fourier) Coefficients

$$c_i = \langle\varphi_i|f\rangle, \qquad i = 0, 1, \ldots, \infty,$$

exactly as in a finite-dimensional space. Then

$$f(x) = \sum_i \langle\varphi_i|f\rangle\varphi_i(x).$$

A linear space (finite- or infinite-dimensional) that (1) has an inner product
defined ($\langle f|g\rangle$) and (2) is complete is a Hilbert space.

Infinite-dimensional Hilbert space provides a natural mathematical frame- 
work for modern quantum mechanics. Away from quantum mechanics, Hilbert 
space retains its abstract mathematical power and beauty but the necessity for 
its use is reduced. 


EXERCISES 

9.4.1 A function $f(x)$ is expanded in a series of orthonormal eigenfunctions

$$f(x) = \sum_{n=0}^{\infty} a_n\varphi_n(x).$$

Show that the series expansion is unique for a given set of $\varphi_n(x)$. The functions
$\varphi_n(x)$ are being taken here as the basis vectors in an infinite dimensional Hilbert
space.

9.4.2 A function $f(x)$ is represented by a finite set of basis functions $\varphi_i(x)$,

$$f(x) = \sum_{i=1}^{N} c_i\varphi_i(x).$$

Show that the components $c_i$ are unique, that no different set $c_i'$ exists.

Note. Your basis functions are automatically linearly independent. They are 
not necessarily orthogonal. 


9.4.3 A function $f(x)$ is approximated by a power series $\sum_{i=0}^{n-1} c_i x^i$ over the interval
[0, 1]. Show that minimizing the mean square error leads to a set of linear equations

$$\mathsf{A}\mathbf{c} = \mathbf{b},$$

where

$$A_{ij} = \int_0^1 x^{i+j}\,dx = \frac{1}{i + j + 1}, \qquad i, j = 0, 1, 2, \ldots, n - 1$$

and

$$b_i = \int_0^1 x^i f(x)\,dx, \qquad i = 0, 1, 2, \ldots, n - 1.$$

Note. The $A_{ij}$ are the elements of the Hilbert matrix of order n. The determinant
of this Hilbert matrix is a rapidly decreasing function of n. For n = 5, det A =
$3.7 \times 10^{-12}$ and the set of equations $\mathsf{A}\mathbf{c} = \mathbf{b}$ is becoming ill-conditioned and
unstable.
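
The rapid decrease of the Hilbert determinant quoted in the note can be reproduced with exact rational arithmetic. The following is a minimal sketch; the function names and the use of Gaussian elimination are illustrative choices, not part of the exercise.

```python
from fractions import Fraction

def hilbert(n):
    """Hilbert matrix of order n with elements A_ij = 1/(i + j + 1), i, j = 0..n-1."""
    return [[Fraction(1, i + j + 1) for j in range(n)] for i in range(n)]

def determinant(matrix):
    """Determinant by Gaussian elimination in exact rational arithmetic."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = Fraction(1)
    for col in range(n):
        # Find a nonzero pivot in this column (row swaps flip the sign).
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

for n in range(1, 7):
    print(n, float(determinant(hilbert(n))))
```

Exact fractions are used here precisely because floating-point elimination on this matrix loses accuracy quickly, which is the ill-conditioning the note warns about.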


9.4.4 In place of the expansion of a function $F(x)$ given by

$$F(x) = \sum_{n=0}^{\infty} a_n\varphi_n(x),$$

with

$$a_n = \int_a^b F(x)\varphi_n(x)w(x)\,dx,$$

take the finite series approximation

$$F(x) \approx \sum_{n=0}^{m} c_n\varphi_n(x).$$

Show that the mean square error

$$\int_a^b \left[F(x) - \sum_{n=0}^{m} c_n\varphi_n(x)\right]^2 w(x)\,dx$$

is minimized by taking $c_n = a_n$.

Note. The values of the coefficients are independent of the number of terms in 
the finite series. This independence is a consequence of orthogonality and would 
not hold for a least-squares fit using powers of x. 


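9.4.5 From Example 9.2.2

$$f(x) = \begin{cases} h/2, & 0 < x < \pi \\ -h/2, & -\pi < x < 0 \end{cases} \;=\; \frac{2h}{\pi}\sum_{n=0}^{\infty} \frac{\sin(2n + 1)x}{2n + 1}.$$

(a) Show that

$$\int_{-\pi}^{\pi} [f(x)]^2\,dx = \frac{\pi}{2}h^2 = \frac{4h^2}{\pi}\sum_{n=0}^{\infty} (2n + 1)^{-2}.$$

For a finite upper limit this would be Bessel's inequality. For the upper limit,
$\infty$, as shown, this is Parseval's identity.

(b) Verify that

$$\frac{\pi}{2}h^2 = \frac{4h^2}{\pi}\sum_{n=0}^{\infty} (2n + 1)^{-2}$$

by evaluating the series.

Hint. The series can be expressed as a Riemann zeta function.


9.4.6 Differentiate Eq. 9.78:

$$\langle\psi|\psi\rangle = \langle f|f\rangle + \lambda\langle f|g\rangle + \lambda^*\langle g|f\rangle + \lambda\lambda^*\langle g|g\rangle$$

with respect to $\lambda^*$ and show that you get the Schwarz inequality, Eq. 9.77.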


9.4.7 Derive the Schwarz inequality from the identity

$$\left[\int_a^b f(x)g(x)\,dx\right]^2 = \int_a^b [f(x)]^2\,dx \int_a^b [g(x)]^2\,dx - \frac{1}{2}\int_a^b\!\!\int_a^b [f(x)g(y) - f(y)g(x)]^2\,dx\,dy.$$


9.4.8 If the functions $f(x)$ and $g(x)$ of the Schwarz inequality, Eq. 9.77, may be expanded
in a series of eigenfunctions $\varphi_i(x)$, show that Eq. 9.77 reduces to Eq. 9.75 (with
n possibly infinite).

Note the description of $f(x)$ as a vector in a function space in which $\varphi_i(x)$
corresponds to the unit vector $\mathbf{e}_i$.


9.4.9 The operator H is Hermitian and positive definite, that is,

$$\int_a^b f^*Hf\,dx > 0.$$

Prove the generalized Schwarz inequality:

$$\left|\int_a^b f^*Hg\,dx\right|^2 \le \int_a^b f^*Hf\,dx \int_a^b g^*Hg\,dx.$$


9.4.10 (a) The Dirac delta function representation given by Eq. 9.83

$$\delta(x - t) = \sum_{n=0}^{\infty} \varphi_n(x)\varphi_n(t)$$

is often called the closure relation. For an orthonormal set of functions,
$\varphi_n$, show that closure implies completeness, that is, Eq. 9.61 follows from
Eq. 9.83.

Hint. One can take

$$F(x) = \int F(t)\delta(x - t)\,dt.$$

(b) Following the hint of part (a) you encounter the integral $\int F(t)\varphi_n(t)\,dt$. How
do you know that this integral is finite?


9.4.11 For the finite interval $(-\pi, \pi)$ expand the Dirac delta function $\delta(x - t)$ in a series
of sines and cosines: sin nx, cos nx, n = 0, 1, 2, .... Note that although these
functions are orthogonal, they are not normalized to unity.

9.4.12 Substitute Eq. 9.90, the eigenfunction expansion of Green's function, into Eq.
9.91 and then show that Eq. 9.91 is indeed a solution of the nonhomogeneous
Helmholtz equation (9.85).


9.4.13 (a) Starting with a one-dimensional nonhomogeneous differential equation,
(Eq. 9.92), assume that $\psi(x)$ and $\rho(x)$ may be represented by eigenfunction
expansions. Without any use of the Dirac delta function or its representations,
show that

$$\psi(x) = \sum_{n=0}^{\infty} \frac{\int_a^b \rho(t)\varphi_n(t)\,dt}{\lambda_n - \lambda}\,\varphi_n(x).$$

Note that (1) if $\rho = 0$, no solution exists unless $\lambda = \lambda_n$ and (2) if $\lambda = \lambda_n$,
no solution exists unless $\rho$ is orthogonal to $\varphi_n$. This same behavior will
reappear with integral equations in Section 16.4.

(b) Interchanging summation and integration, show that you have constructed
the Green's function corresponding to Eq. 9.93.


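9.4.14 The eigenfunctions of the Schrodinger equation are often complex. In this case
the orthogonality integral, Eq. 9.39, is replaced by

$$\int_a^b \varphi_i^*(x)\varphi_j(x)w(x)\,dx = \delta_{ij}.$$

Instead of Eq. 9.83, we have

$$\delta(\mathbf{r}_1 - \mathbf{r}_2) = \sum_{n=0}^{\infty} \varphi_n(\mathbf{r}_1)\varphi_n^*(\mathbf{r}_2).$$

Show that the Green's function, Eq. 9.90, becomes

$$G(\mathbf{r}_1, \mathbf{r}_2) = \sum_{n=0}^{\infty} \frac{\varphi_n(\mathbf{r}_1)\varphi_n^*(\mathbf{r}_2)}{k_n^2 - k^2} = G^*(\mathbf{r}_2, \mathbf{r}_1).$$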


9.4.15 A normalized wave function $\psi(x) = \sum_{n=0}^{\infty} a_n\varphi_n(x)$. The expansion coefficients
$a_n$ are known as probability amplitudes. We may define a density matrix $\rho$ with
elements $\rho_{ij} = a_i a_j^*$. Show that

$$(\rho^2)_{ij} = \rho_{ij}$$

or

$$\rho^2 = \rho.$$

This result, by definition, makes $\rho$ a projection operator.
Hint.

$$\int \psi^*\psi\,dx = 1.$$


9.4.16 Show that

(a) the operator

$$|\varphi_i(x)\rangle\langle\varphi_i(t)|$$

operating on

$$f(t) = \sum_j c_j|\varphi_j(t)\rangle$$

yields $c_i|\varphi_i(x)\rangle$.
This operator is a projection operator projecting $f(x)$ onto the ith coordinate,
selectively picking out the ith component $c_i|\varphi_i(x)\rangle$ of $f(x)$.

(b) $\sum_i |\varphi_i(x)\rangle\langle\varphi_i(x)| = 1$.

Hint. The operator operates via the defined inner product.

THE GAMMA FUNCTION (FACTORIAL FUNCTION)

The gamma function appears occasionally in physical problems such as the 
normalization of Coulomb wave functions and the computation of probabilities 
in statistical mechanics. In general, however, it has less direct physical applica- 
tion and interpretation than, say, the Legendre and Bessel functions of Chapters 
11 and 12. Rather, its importance stems from its usefulness in developing other 
functions that have direct physical application. The gamma function, therefore, 
is included here. A discussion of the numerical evaluation of the gamma function 
appears in Section 10.3. 


10.1 DEFINITIONS, SIMPLE PROPERTIES 

At least three different, convenient definitions of the gamma function are in 
common use. Our first task is to state these definitions, to develop some simple, 
direct consequences, and to show the equivalence of the three forms. 


Infinite Limit (Euler) 

The first definition, named after Euler, is

$$\Gamma(z) \equiv \lim_{n\to\infty} \frac{1\cdot 2\cdot 3\cdots n}{z(z + 1)(z + 2)\cdots(z + n)}\,n^z, \qquad z \ne 0, -1, -2, -3, \ldots. \tag{10.1}$$

This definition of $\Gamma(z)$ is useful in developing the Weierstrass infinite-product
form of $\Gamma(z)$, Eq. 10.16, and in obtaining the derivative of $\ln\Gamma(z)$ (Section
10.2). Here and elsewhere in this chapter z may be either real or complex.
Replacing z with z + 1, we have

$$\Gamma(z + 1) = \lim_{n\to\infty} \frac{1\cdot 2\cdot 3\cdots n}{(z + 1)(z + 2)(z + 3)\cdots(z + n + 1)}\,n^{z+1}
= \lim_{n\to\infty} \frac{nz}{z + n + 1}\cdot\frac{1\cdot 2\cdot 3\cdots n}{z(z + 1)(z + 2)\cdots(z + n)}\,n^z = z\Gamma(z). \tag{10.2}$$

This is the basic functional relation for the gamma function. It should be noted
that it is a difference equation. It has been shown that the gamma function is
one of a general class of functions that do not satisfy any differential equation
with rational coefficients. Specifically, the gamma function is one of the very
few functions of mathematical physics that does not satisfy either the
hypergeometric differential equation (Section 13.5) or the confluent hypergeometric
equation (Section 13.6).
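
The Euler limit can be evaluated at finite n to see how slowly it converges. The sketch below is illustrative: the function name `gamma_euler` and the choice n = 100000 are assumptions, and the product is accumulated factor by factor so that the large factorial never has to be formed explicitly.

```python
import math

def gamma_euler(z, n):
    """Euler's limit form of the gamma function evaluated at finite n.

    Computes n! * n**z / (z (z+1) ... (z+n)) as a running product of k/(z+k).
    Works for real or complex z that is not zero or a negative integer.
    """
    product = 1.0 / z
    for k in range(1, n + 1):
        product *= k / (z + k)
    return product * n ** z

n = 100000
for z in (0.5, 3.0):
    print(z, gamma_euler(z, n), math.gamma(z))

# The functional relation Gamma(z + 1) = z Gamma(z) holds approximately at finite n:
z = 0.5
print(gamma_euler(z + 1.0, n) / gamma_euler(z, n), "vs", z)
```

At finite n the ratio of the two approximants is exactly nz/(z + n + 1), so the error in the functional relation is of order 1/n, consistent with the limit taken in the derivation of the recurrence.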

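Also, from the definition

$$\Gamma(1) = \lim_{n\to\infty} \frac{1\cdot 2\cdot 3\cdots n}{1\cdot 2\cdot 3\cdots n(n + 1)}\,n = 1. \tag{10.3}$$

Now, application of Eq. 10.2 gives

$$\Gamma(2) = 1,$$
$$\Gamma(3) = 2\Gamma(2) = 2, \ldots$$
$$\Gamma(n) = 1\cdot 2\cdot 3\cdots(n - 1) = (n - 1)!. \tag{10.4}$$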


Definite Integral (Euler) 

A second definition, also frequently called Euler's form, is

$$\Gamma(z) \equiv \int_0^{\infty} e^{-t}t^{z-1}\,dt, \qquad \Re(z) > 0. \tag{10.5}$$

The restriction on z is necessary to avoid divergence of the integral. When the
gamma function does appear in physical problems, it is often in this form or
some variation such as

$$\Gamma(z) = 2\int_0^{\infty} e^{-t^2}t^{2z-1}\,dt, \qquad \Re(z) > 0, \tag{10.6}$$

$$\Gamma(z) = \int_0^1 \left[\ln\left(\frac{1}{t}\right)\right]^{z-1} dt, \qquad \Re(z) > 0. \tag{10.7}$$

When $z = \tfrac{1}{2}$, Eq. 10.6 is just the Gauss error function, and we have the interesting
result

$$\Gamma(\tfrac{1}{2}) = \sqrt{\pi}. \tag{10.8}$$

Generalizations of Eq. 10.6, the Gaussian integrals, are considered in Exercise
10.1.11. This definite integral form of $\Gamma(z)$, Eq. 10.5, leads to the beta function,
Section 10.4.
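
The Gaussian variant, Eq. 10.6, is convenient for a quick numerical check because its integrand is finite at t = 0 whenever z is at least 1/2. The sketch below is an illustration only; the cutoff at t = 8, the midpoint rule, and the function names are assumptions made for the example.

```python
import math

def midpoint(func, a, b, steps=20000):
    """Composite midpoint rule on (a, b)."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

def gamma_eq_10_6(z):
    """Eq. 10.6 for real z >= 1/2, with the upper limit cut off at t = 8.

    The neglected tail is of order exp(-64) and is far below rounding error.
    """
    return midpoint(lambda t: 2.0 * math.exp(-t * t) * t ** (2.0 * z - 1.0), 0.0, 8.0)

print(gamma_eq_10_6(0.5), math.sqrt(math.pi))   # Eq. 10.8
print(gamma_eq_10_6(3.0), math.factorial(2))    # Gamma(3) = 2!
print(gamma_eq_10_6(2.5), math.gamma(2.5))
```

The first line reproduces $\Gamma(\tfrac{1}{2}) = \sqrt{\pi}$, and the second agrees with $\Gamma(n) = (n - 1)!$ from Eq. 10.4.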

To show the equivalence of these two definitions, Eqs. 10.1 and 10.5, consider
the function of two variables

$$F(z, n) = \int_0^n \left(1 - \frac{t}{n}\right)^n t^{z-1}\,dt, \qquad \Re(z) > 0, \tag{10.9}$$

with n a positive integer (footnote 1). Since

$$\lim_{n\to\infty} \left(1 - \frac{t}{n}\right)^n \equiv e^{-t}, \tag{10.10}$$

from the definition of the exponential

$$\lim_{n\to\infty} F(z, n) = F(z, \infty) = \int_0^{\infty} e^{-t}t^{z-1}\,dt \equiv \Gamma(z) \tag{10.11}$$

by Eq. 10.5.

Returning to $F(z, n)$, we evaluate it in successive integrations by parts. For
convenience let $u = t/n$. Then

$$F(z, n) = n^z \int_0^1 (1 - u)^n u^{z-1}\,du. \tag{10.12}$$

Integrating by parts, we obtain

$$\frac{F(z, n)}{n^z} = (1 - u)^n \frac{u^z}{z}\bigg|_0^1 + \frac{n}{z}\int_0^1 (1 - u)^{n-1}u^z\,du. \tag{10.13}$$

Repeating this with the integrated part vanishing at both end points each time,
we finally get

$$F(z, n) = n^z\,\frac{n(n - 1)\cdots 1}{z(z + 1)\cdots(z + n - 1)} \int_0^1 u^{z+n-1}\,du
= \frac{1\cdot 2\cdot 3\cdots n}{z(z + 1)(z + 2)\cdots(z + n)}\,n^z. \tag{10.14}$$

This is identical with the expression on the right side of Eq. 10.1. Hence

$$\lim_{n\to\infty} F(z, n) = F(z, \infty) \equiv \Gamma(z) \tag{10.15}$$

by Eq. 10.1, completing the proof.


Infinite Product (Weierstrass) 

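The third definition (Weierstrass's form) is

$$\frac{1}{\Gamma(z)} \equiv z e^{\gamma z} \prod_{n=1}^{\infty} \left(1 + \frac{z}{n}\right) e^{-z/n}, \tag{10.16}$$

where $\gamma$ is the usual Euler-Mascheroni constant,

$$\gamma = 0.577\,216\ldots. \tag{10.17}$$

This infinite-product form may be used to develop the reflection identity,
Eq. 10.23, and applied in the exercises, such as Exercise 10.1.19. This form can
be derived from the original definition (Eq. 10.1) by rewriting it as

$$\Gamma(z) = \lim_{n\to\infty} \frac{1\cdot 2\cdot 3\cdots n}{z(z + 1)\cdots(z + n)}\,n^z = \lim_{n\to\infty} \frac{1}{z}\prod_{m=1}^{n} \left(1 + \frac{z}{m}\right)^{-1} n^z. \tag{10.18}$$

1 The form of $F(z, n)$ is suggested by the beta function (compare Eq. 10.60).

Inverting and using

$$n^{-z} = e^{(-\ln n)z}, \tag{10.19}$$

we obtain

$$\frac{1}{\Gamma(z)} = z \lim_{n\to\infty} e^{(-\ln n)z} \prod_{m=1}^{n} \left(1 + \frac{z}{m}\right). \tag{10.20}$$

Multiplying and dividing by

$$\exp\left[\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}\right)z\right] = \prod_{m=1}^{n} e^{z/m}, \tag{10.21}$$

we get

$$\frac{1}{\Gamma(z)} = z\left\{\lim_{n\to\infty} \exp\left[\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \ln n\right)z\right]\right\}
\left[\lim_{n\to\infty} \prod_{m=1}^{n} \left(1 + \frac{z}{m}\right) e^{-z/m}\right]. \tag{10.22}$$

As shown in Section 5.2, the infinite series in the exponent converges and defines
$\gamma$, the Euler-Mascheroni constant. Hence Eq. 10.16 follows.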
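
A truncated version of the Weierstrass product, Eq. 10.16, can be compared with a library gamma function. The sketch below is for real z only; the function name, the number of factors, and the hard-coded value of the Euler-Mascheroni constant are assumptions made for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, Eq. 10.17

def reciprocal_gamma(z, n_factors=100000):
    """Truncated Weierstrass product for 1/Gamma(z), real z (Eq. 10.16)."""
    product = z * math.exp(EULER_GAMMA * z)
    for m in range(1, n_factors + 1):
        product *= (1.0 + z / m) * math.exp(-z / m)
    return product

for z in (0.5, 1.0, 2.5):
    print(z, reciprocal_gamma(z), 1.0 / math.gamma(z))

# A factor (1 + z/m) vanishes at z = -m, giving the zeros of 1/Gamma(z):
print(reciprocal_gamma(-2.0))
```

The product form makes the zeros of $1/\Gamma(z)$ at $z = 0, -1, -2, \ldots$ explicit, which is why the truncated product returns exactly zero at $z = -2$.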

It was shown in Section 5.11 that the Weierstrass infinite-product definition
of $\Gamma(z)$ led directly to an important identity,

$$\Gamma(z)\Gamma(1 - z) = \frac{\pi}{\sin z\pi}. \tag{10.23}$$

This identity may also be derived by contour integration (Example 7.2.5 and
Exercises 7.2.18 and 7.2.19) and the beta function, Section 10.4. Setting $z = \tfrac{1}{2}$
in Eq. 10.23, we obtain

$$\Gamma(\tfrac{1}{2}) = \sqrt{\pi} \tag{10.24}$$

(taking the positive square root) in agreement with Eq. 10.8.

The Weierstrass definition shows immediately that $\Gamma(z)$ has simple poles at
$z = 0, -1, -2, -3, \ldots$, and that $[\Gamma(z)]^{-1}$ has no poles in the finite complex
plane, which means that $\Gamma(z)$ has no zeros. This behavior may also be seen in
Eq. 10.23, in which we note that $\pi/(\sin\pi z)$ is never equal to zero.
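
The reflection identity, Eq. 10.23, is easy to spot-check for real nonintegral z with the standard library gamma function. The sample points below are arbitrary illustrative choices.

```python
import math

def reflection_sides(z):
    """Return both sides of Eq. 10.23 for real, nonintegral z."""
    lhs = math.gamma(z) * math.gamma(1.0 - z)
    rhs = math.pi / math.sin(math.pi * z)
    return lhs, rhs

for z in (0.25, 0.5, -0.3, 2.7):
    lhs, rhs = reflection_sides(z)
    print(z, lhs, rhs)
```

For z = 1/2 both sides equal $\pi$, which is Eq. 10.24 squared. The negative sample points show that the identity continues to hold where the integral definition, Eq. 10.5, no longer converges.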

Actually the infinite-product definition of $\Gamma(z)$ may be derived from the
Weierstrass factorization theorem with the specification that $[\Gamma(z)]^{-1}$ have
simple zeros at $z = 0, -1, -2, -3, \ldots$. The Euler-Mascheroni constant is fixed
by requiring $\Gamma(1) = 1$.

In probability theory the gamma distribution (probability density) is given
by

$$f(x) = \begin{cases} \dfrac{1}{\beta^{\alpha}\Gamma(\alpha)}\,x^{\alpha-1}e^{-x/\beta}, & x > 0 \\ 0, & x \le 0. \end{cases} \tag{10.24a}$$

The constant $[\beta^{\alpha}\Gamma(\alpha)]^{-1}$ is chosen so that the total (integrated) probability
will be unity. For $x \to E$, kinetic energy, $\alpha \to \tfrac{3}{2}$ and $\beta \to kT$, Eq. 10.24a yields
the classical Maxwell-Boltzmann statistics.


Factorial Notation 

So far this discussion has been presented in terms of the classical notation. As
pointed out by Jeffreys and others, the $-1$ of the $z - 1$ exponent in our second
definition (Eq. 10.5) is a continual nuisance. Accordingly, Eq. 10.5 is rewritten as

$$\int_0^{\infty} e^{-t}t^z\,dt \equiv z!, \qquad \Re(z) > -1, \tag{10.25}$$

to define a factorial function z!. Occasionally we may still encounter Gauss's
notation, $\Pi(z)$, for the factorial function

$$\Pi(z) = z!. \tag{10.26}$$

The $\Gamma$ notation is due to Legendre. The factorial function of Eq. 10.25 is, of
course, related to the gamma function by

$$\Gamma(z) = (z - 1)!$$

or

$$\Gamma(z + 1) = z!. \tag{10.27}$$

If z = n, a positive integer (Eq. 10.4) shows that

$$z! = n! = 1\cdot 2\cdot 3\cdots n, \tag{10.28}$$

the familiar factorial. However, it should be noted carefully that since z! is now
defined by Eq. 10.25 (or equivalently by Eq. 10.27) the factorial function is no
longer limited to positive integral values of the argument (Figure 10.1). The
difference relation (Eq. 10.2) becomes

$$(z - 1)! = \frac{z!}{z}. \tag{10.29}$$

FIG. 10.1 The factorial function: extension to negative arguments

FIG. 10.2 The factorial function and the first two derivatives of ln(x!)


This shows immediately that

$$0! = 1 \tag{10.30}$$

and

$$n! = \pm\infty \quad \text{for } n \text{, a negative integer.} \tag{10.31}$$

In terms of the factorial function Eq. 10.23 becomes

$$z!\,(-z)! = \frac{\pi z}{\sin\pi z}. \tag{10.32}$$

By restricting ourselves to the real values of the argument, we find that x!
defines the curve shown in Fig. 10.2. The minimum of the curve is

$$x! = (0.461\,63\ldots)! = 0.885\,60\ldots. \tag{10.33}$$
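
The location and value of the minimum quoted in Eq. 10.33 can be recovered numerically. The sketch below uses a simple ternary search on x! = Γ(x + 1) over 0 ≤ x ≤ 1; the search method and the function names are illustrative choices, and the bracket assumes (as the figure indicates) a single minimum in that range.

```python
import math

def factorial_fn(x):
    """The factorial function x! = Gamma(x + 1) for real x > -1."""
    return math.gamma(x + 1.0)

def minimize(func, lo, hi, iterations=200):
    """Ternary search for the minimum of a function with one minimum on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

x_min = minimize(factorial_fn, 0.0, 1.0)
print(x_min, factorial_fn(x_min))
```

The search settles near x = 0.46163 with x! close to 0.88560, in agreement with Eq. 10.33.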


Double Factorial Notation 

In many problems of mathematical physics, particularly in connection with
Legendre polynomials (Chapter 12), we encounter products of the odd positive
integers and products of the even positive integers. For convenience these are
given special labels: double factorials.

$$1\cdot 3\cdot 5\cdots(2n + 1) = (2n + 1)!!$$
$$2\cdot 4\cdot 6\cdots(2n) = (2n)!!$$

Clearly, these are related to the regular factorial functions by

$$(2n)!! = 2^n n! \tag{10.33b}$$

and

$$(2n + 1)!! = \frac{(2n + 1)!}{2^n n!}. \tag{10.33c}$$
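
The two relations between double and ordinary factorials can be verified directly for small n. In this minimal sketch the function name `double_factorial` is an illustrative choice, and the values 0!! = (-1)!! = 1 follow the empty-product convention quoted in the answer to Exercise 10.1.12.

```python
import math

def double_factorial(n):
    """n!! for integer n >= -1, with 0!! = (-1)!! = 1 (empty product)."""
    if n < -1:
        raise ValueError("this sketch handles only n >= -1")
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

for n in range(6):
    even = double_factorial(2 * n)
    odd = double_factorial(2 * n + 1)
    # Right-hand sides of Eqs. 10.33b and 10.33c.
    even_rhs = 2 ** n * math.factorial(n)
    odd_rhs = math.factorial(2 * n + 1) // (2 ** n * math.factorial(n))
    print(n, even, even_rhs, odd, odd_rhs)
```

Integer arithmetic keeps both checks exact, so any mismatch would indicate an error in the relations rather than rounding.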


FIG. 10.4 (Bottom) The contour of Fig. 10.3 deformed


Integral Representation 

An integral representation that is useful in developing asymptotic series for
the Bessel functions is

$$\int_C e^{-z}z^{\nu}\,dz = (e^{2\pi i\nu} - 1)\,\nu!, \tag{10.34}$$

where C is the contour shown in Fig. 10.3. This contour integral representation
is particularly useful when $\nu$ is not an integer, z = 0 then being a branch point.
Equation 10.34 may be readily verified for $\nu > -1$ by deforming the contour as
shown in Fig. 10.4. The integral from $\infty$ into the origin yields $-(\nu!)$, placing the
phase of z at 0. The integral out to $\infty$ (in the fourth quadrant) then yields $e^{2\pi i\nu}\nu!$,
the phase of z having increased to $2\pi$. Since the circle around the origin
contributes nothing when $\nu > -1$, Eq. 10.34 follows.

It is often convenient to throw this result into a more symmetrical form

$$\int_C e^{-z}(-z)^{\nu}\,dz = 2i\sin\nu\pi\;\nu!. \tag{10.35}$$

This corresponds to choosing the phase of z to have a range of $-\pi$ to $+\pi$ in
Eq. 10.34.

This analysis establishes Eqs. 10.34 and 10.35 for $\nu > -1$. It is relatively
simple to extend the range to include all nonintegral $\nu$. First, we note that the
integral exists for $\nu < -1$ as long as we stay away from the origin. Second,
integrating by parts we find that Eq. 10.35 yields the familiar difference relation
(Eq. 10.29). If we take the difference relation to define the factorial function of
$\nu < -1$, then Eqs. 10.34 and 10.35 are verified for all $\nu$ (except negative integers).


EXERCISES 


10.1.1 Derive the recurrence relations

$$\Gamma(z + 1) = z\Gamma(z)$$

from the Euler integral form (Eq. 10.5),

$$\Gamma(z) = \int_0^{\infty} e^{-t}t^{z-1}\,dt.$$

10.1.2 In a power-series solution for the Legendre functions of the second kind we
encounter the expression

$$\frac{(n + 1)(n + 2)(n + 3)\cdots(n + 2s - 1)(n + 2s)}{2\cdot 4\cdot 6\cdot 8\cdots(2s - 2)(2s)\cdot(2n + 3)(2n + 5)(2n + 7)\cdots(2n + 2s + 1)},$$

in which s is a positive integer. Rewrite this expression in terms of factorials.


10.1.3 Show that

$$\frac{(s - n)!}{(2s - 2n)!} = \frac{(-1)^{n-s}(2n - 2s)!}{(n - s)!}.$$

Here s and n are integers with s < n. This result can be used to avoid negative
factorials such as in the series representations of the spherical Neumann functions
and the Legendre functions of the second kind.


10.1.4 Show that $\Gamma(z)$ may be written

$$\Gamma(z) = 2\int_0^{\infty} e^{-t^2}t^{2z-1}\,dt, \qquad \Re(z) > 0,$$

$$\Gamma(z) = \int_0^1 \left[\ln\left(\frac{1}{t}\right)\right]^{z-1} dt, \qquad \Re(z) > 0.$$


10.1.5 In a Maxwellian distribution the fraction of particles between the speed v and
v + dv is

$$\frac{dN}{N} = 4\pi\left(\frac{m}{2\pi kT}\right)^{3/2} \exp(-mv^2/2kT)\,v^2\,dv,$$

N being the total number of particles. The average or expectation value of $v^n$ is
defined as $\langle v^n\rangle = N^{-1}\int v^n\,dN$. Show that

$$\langle v^n\rangle = \left(\frac{2kT}{m}\right)^{n/2} \frac{\left(\frac{n + 1}{2}\right)!}{\left(\frac{1}{2}\right)!}.$$

10.1.6 By transforming the integral into a gamma function, show that

$$-\int_0^1 x^k \ln x\,dx = \frac{1}{(k + 1)^2}, \qquad k > -1.$$


10.1.7 Show that

$$\int_0^{\infty} e^{-x^4}\,dx = \left(\tfrac{1}{4}\right)!.$$


10.1.8 Show that

$$\lim_{x\to 0} \frac{(ax - 1)!}{(x - 1)!} = \frac{1}{a}.$$

10.1.9 Locate the poles of $\Gamma(z)$. Show that they are simple poles and determine the
residues.

10.1.10 Show that the equation x! = k, $k \ne 0$, has an infinite number of real roots.

10.1.11 Show that

(a) $$\int_0^{\infty} x^{2s+1}\exp(-ax^2)\,dx = \frac{s!}{2a^{s+1}}.$$

(b) $$\int_0^{\infty} x^{2s}\exp(-ax^2)\,dx = \frac{(s - \tfrac{1}{2})!}{2a^{s+1/2}} = \frac{(2s - 1)!!}{2^{s+1}a^s}\sqrt{\frac{\pi}{a}}.$$

These Gaussian integrals are of major importance in statistical mechanics.

10.1.12 (a) Develop recurrence relations for (2n)!! and for (2n + 1)!!.

(b) Use these recurrence relations to calculate (or to define) 0!! and (-1)!!.

ANS. 0!! = 1
(-1)!! = 1.


10.1.13 For s a nonnegative integer, show that

$$(-2s - 1)!! = \frac{(-1)^s}{(2s - 1)!!} = \frac{(-1)^s 2^s s!}{(2s)!}.$$


10.1.14 Express the coefficient of the nth term of the expansion of $(1 + x)^{1/2}$

(a) in terms of factorials of integers.

(b) in terms of the double factorial (!!) functions.

ANS. $$a_n = (-1)^{n+1}\frac{(2n - 3)!}{2^{2n-2}\,n!\,(n - 2)!} = (-1)^{n+1}\frac{(2n - 3)!!}{(2n)!!}, \qquad n = 2, 3, 4, \ldots.$$


10.1.15 Express the coefficient of the nth term of the expansion of $(1 + x)^{-1/2}$

(a) in terms of the factorials of integers.

(b) in terms of the double factorial (!!) functions.

ANS. $$a_n = (-1)^n\frac{(2n)!}{2^{2n}(n!)^2} = (-1)^n\frac{(2n - 1)!!}{(2n)!!}, \qquad n = 1, 2, 3, \ldots.$$


10.1.16 The Legendre polynomial may be written as

$$P_n(\cos\theta) = 2\,\frac{(2n - 1)!!}{(2n)!!}\left[\cos n\theta + \frac{1}{1}\cdot\frac{n}{2n - 1}\cos(n - 2)\theta
+ \frac{1\cdot 3}{1\cdot 2}\cdot\frac{n(n - 1)}{(2n - 1)(2n - 3)}\cos(n - 4)\theta
+ \frac{1\cdot 3\cdot 5}{1\cdot 2\cdot 3}\cdot\frac{n(n - 1)(n - 2)}{(2n - 1)(2n - 3)(2n - 5)}\cos(n - 6)\theta + \cdots\right].$$

Let n = 2s + 1. Then $P_n(\cos\theta) = P_{2s+1}(\cos\theta) = \sum_{m=0}^{s} a_m\cos(2m + 1)\theta$.
Find $a_m$ in terms of factorials and double factorials.


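10.1.17 (a) Show that

$$\Gamma(\tfrac{1}{2} - n)\,\Gamma(\tfrac{1}{2} + n) = (-1)^n\pi,$$

where n is an integer.

(b) Express $\Gamma(\tfrac{1}{2} + n)$ and $\Gamma(\tfrac{1}{2} - n)$ separately in terms of $\pi^{1/2}$ and a !! function.

ANS. $$\Gamma(\tfrac{1}{2} + n) = \frac{(2n - 1)!!}{2^n}\,\pi^{1/2}.$$


10.1.18 From one of the definitions of the factorial or gamma function, show that

$$|(ix)!|^2 = \frac{\pi x}{\sinh\pi x}.$$


10.1.19 Prove that

$$|\Gamma(\alpha + i\beta)| = |\Gamma(\alpha)| \prod_{n=0}^{\infty} \left[1 + \frac{\beta^2}{(\alpha + n)^2}\right]^{-1/2}.$$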

This equation has been useful in calculations in the theory of beta decay. 
10.1.20 Show that 


for n, a positive integer. 

10.1.21 Show that 

|x!| > |(x + iy ) ! | 

for all x. The variables x and y are real. 

10.1.22 Show that

\[ |(-\tfrac12 + iy)!|^2 = \frac{\pi}{\cosh \pi y}. \]


10.1.23 The probability density associated with the normal distribution of statistics is
given by

\[ f(x) = \frac{1}{\sigma(2\pi)^{1/2}} \exp\left[ -\frac{(x-\mu)^2}{2\sigma^2} \right] \]

with $(-\infty, \infty)$ for the range of x. Show that

(a) the mean value of x, $\langle x\rangle$, is equal to $\mu$,

(b) the standard deviation $(\langle x^2\rangle - \langle x\rangle^2)^{1/2}$ is given by $\sigma$.


10.1.24 From the gamma distribution of Eq. 10.33a

\[ f(x) = \begin{cases} \dfrac{1}{\beta^\alpha\,\Gamma(\alpha)}\, x^{\alpha-1} e^{-x/\beta}, & x > 0 \\ 0, & x < 0, \end{cases} \]

show that

(a) $\langle x\rangle$ (mean) $= \alpha\beta$,

(b) $\sigma^2$ (variance) $\equiv \langle x^2\rangle - \langle x\rangle^2 = \alpha\beta^2$.

10.1.25 The wave function of a particle scattered by a pure Coulomb potential is $\psi(r, \theta)$.
At the origin the wave function becomes

\[ \psi(0) = e^{-\pi\gamma/2}\,\Gamma(1 + i\gamma), \]

where $\gamma = Z_1 Z_2 e^2/\hbar v$. Show that

\[ \psi^*(0)\psi(0) = \frac{2\pi\gamma}{e^{2\pi\gamma} - 1}. \]


10.1.26 Derive the contour integral representation

\[ 2i \sin \nu\pi \; \nu! = \int_C e^{-z}(-z)^{\nu}\,dz. \]


10.1.27 Write a function subprogram FACT(N) (fixed point independent variable) that
will calculate N!. Include provision for rejection and appropriate error message
if N is negative.

Note. For small N direct multiplication is simplest. For large N, if large N are
considered, Eq. 10.55, Stirling’s series, would be appropriate.

10.1.28 (a) Write a function subprogram to calculate the double factorial ratio
(2N − 1)!!/(2N)!!. Include provision for N = 0 and for rejection and an
error message if N is negative. Calculate and tabulate this ratio for N =
1(1)100.

(b) Check your function subprogram calculation of 199!!/200!! against the
value obtained from Stirling’s series (Section 10.3).

ANS. 199!!/200!! = 0.056348

10.1.29 Using either the Fortran supplied GAMMA or a library supplied subroutine
for x! or Γ(x), determine the value of x for which Γ(x) is a minimum (1 < x < 2)
and this minimum value of Γ(x). Notice that although the minimum value of
Γ(x) may be obtained to about six significant figures (single precision), the
corresponding value of x is much less accurate. Why this relatively low accuracy?

10.1.30 The factorial function expressed in integral form can be evaluated by the
Gauss-Laguerre quadrature. For a 10-point formula Appendix 2 guarantees
the resultant x! theoretically exact for x an integer, 0 up through 19. What
happens if x is not an integer? Use the Gauss-Laguerre quadrature to evaluate
x!, x = 0.0(0.1)2.0. Tabulate the absolute error as a function of x.

Check value. $x!_{\text{exact}} - x!_{\text{quadrature}} = 0.00034$ for x = 1.3.


10.2 DIGAMMA AND POLYGAMMA FUNCTIONS

Digamma Functions 

As may be noted from the three definitions in Section 10.1, it is inconvenient
to deal with the derivatives of the gamma or factorial function directly. Instead,
it is customary to take the natural logarithm of the factorial function (Eq. 10.1),
convert the product to a sum, and then differentiate, that is,

\[ z! = z\Gamma(z) = \lim_{n\to\infty} \frac{n!}{(z+1)(z+2)\cdots(z+n)}\, n^z \tag{10.36} \]

and

\[ \ln(z!) = \lim_{n\to\infty} \left[ \ln(n!) + z\ln n - \ln(z+1) - \ln(z+2) - \cdots - \ln(z+n) \right], \tag{10.37} \]

in which the logarithm of the limit is equal to the limit of the logarithm.
Differentiating with respect to z, we obtain

\[ \frac{d}{dz}\ln(z!) \equiv F(z) = \lim_{n\to\infty}\left( \ln n - \frac{1}{z+1} - \frac{1}{z+2} - \cdots - \frac{1}{z+n} \right), \tag{10.38} \]


which defines F(z), the digamma function. From the definition of the
Euler-Mascheroni constant,¹ Eq. 10.38 may be rewritten as

\[ F(z) = -\gamma - \sum_{n=1}^{\infty}\left( \frac{1}{z+n} - \frac{1}{n} \right) = -\gamma + \sum_{n=1}^{\infty} \frac{z}{n(n+z)}. \tag{10.39} \]


One application of Eq. 10.39 is in the derivation of the series form of the
Neumann function (Section 11.3). Clearly,²

\[ F(0) = -\gamma = -0.577\,215\,664\,901\ldots. \tag{10.40} \]

Another, perhaps more useful, expression for F(z) is derived in Section 10.3. 
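
As a numerical illustration of Eq. 10.39, the following minimal Python sketch sums the series for real z > −1. The function name `digamma_F`, the cutoff `n_terms`, and the integral estimate of the neglected tail are assumptions of this sketch, not part of the text; the tail estimate is needed because the series converges only like 1/n.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant, Eq. 10.40


def digamma_F(z, n_terms=10000):
    """F(z) = d/dz ln(z!) from the series of Eq. 10.39 (real z > -1).

    The first n_terms terms are summed directly. The remainder is
    approximated by the integral of z/(n(n+z)) from n_terms + 1/2 to
    infinity, which equals ln(1 + z/(n_terms + 1/2)).
    """
    partial = sum(z / (n * (n + z)) for n in range(1, n_terms + 1))
    tail = math.log1p(z / (n_terms + 0.5))
    return -EULER_GAMMA + partial + tail
```

For positive integer z the result can be compared with the finite harmonic sum, for example F(2) = 1 + 1/2 − γ.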

Polygamma Function 

The digamma function may be differentiated repeatedly, giving rise to the
polygamma function:

\[ F^{(m)}(z) \equiv \frac{d^{m+1}}{dz^{m+1}}\ln(z!) = (-1)^{m+1}\, m! \sum_{n=1}^{\infty} \frac{1}{(z+n)^{m+1}}, \quad m = 1, 2, 3, \ldots \tag{10.41} \]


A plot of F(x) and F'(x) is included in Fig. 10.1. Since the series in Eq. 10.41
defines the Riemann zeta function³ (with z = 0),

\[ \zeta(m) \equiv \sum_{n=1}^{\infty} \frac{1}{n^m}, \tag{10.42} \]

we have

\[ F^{(m)}(0) = (-1)^{m+1}\, m!\, \zeta(m+1), \quad m = 1, 2, 3, \ldots \tag{10.43} \]

¹ Compare Sections 5.2 and 5.6. We add and subtract $\sum_{s=1}^{n} s^{-1}$.

² γ has been computed to 1271 places by D. E. Knuth, Math. Comp. 16, 275
(1962) and to 3566 decimal places by D. W. Sweeney, Math. Comp. 17, 170
(1963). It may be of interest that the fraction 228/395 gives γ accurate to six
places.

³ Section 5.9. For z ≠ 0 this series may be used to define a generalized zeta
function.

The values of the polygamma functions of positive integral argument, $F^{(m)}(n)$,
may be calculated by using Exercise 10.2.6.

In terms of the perhaps more common Γ notation,

\[ \frac{d^{n+1}}{dz^{n+1}}\ln\Gamma(z) \equiv \frac{d^n}{dz^n}\psi(z) \equiv \psi^{(n)}(z). \tag{10.44a} \]

From Eq. 10.27

\[ \psi^{(n)}(z) = F^{(n)}(z-1). \tag{10.44b} \]


Maclaurin Expansion, Computation 

It is now possible to write a Maclaurin expansion for ln(z!).

\[ \ln(z!) = \sum_{n=1}^{\infty} \frac{z^n}{n!}\, F^{(n-1)}(0) = -\gamma z + \sum_{n=2}^{\infty} (-1)^n \frac{z^n}{n}\, \zeta(n) \tag{10.44c} \]

convergent for |z| < 1; for z = x, the range is −1 < x ≤ 1. Alternate forms of this
series appear in Exercise 5.9.14. Equation 10.44c is a possible means of computing
z! for real or complex z, but Stirling’s series (Section 10.3) is usually better.
In addition, an excellent table of values of the gamma function for complex
arguments, based on the use of Stirling’s series and the recurrence relation (Eq.
10.29), is now available.⁴
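
The Maclaurin series of Eq. 10.44c can be tried out numerically with a short Python sketch. The helper `zeta`, its `cutoff`, and the truncation `n_max` are assumptions made here for illustration; ζ(n) is obtained by a direct sum plus an Euler-Maclaurin estimate of the tail, and the example is only meant for moderate |z| (say |z| ≤ 1/2), where the series converges quickly.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def zeta(n, cutoff=50):
    """Riemann zeta(n) for integer n >= 2.

    Direct sum over k < cutoff plus the leading Euler-Maclaurin terms
    for the remaining sum over k >= cutoff.
    """
    head = sum(k ** -n for k in range(1, cutoff))
    K = float(cutoff)
    tail = K ** (1 - n) / (n - 1) + 0.5 * K ** -n + n * K ** (-n - 1) / 12.0
    return head + tail


def ln_factorial_maclaurin(z, n_max=60):
    """ln(z!) from Eq. 10.44c, truncated after the z**n_max term."""
    total = -EULER_GAMMA * z
    for n in range(2, n_max + 1):
        total += (-1) ** n * z ** n * zeta(n) / n
    return total
```

Near |z| = 1 the terms fall off only like 1/n, which is one reason the text prefers Stirling’s series for general use.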


Series Summation 

The digamma and polygamma functions may also be used in summing series. 
If the general term of the series has the form of a rational fraction (with the 
highest power of the index in the numerator at least two less than the highest 
power of the index in the denominator), it may be transformed by the method of 
partial fractions (compare Section 15.8). The infinite series may then be expressed
as a finite sum of digamma and polygamma functions. The usefulness of this 
method depends on the availability of tables of digamma and polygamma func- 
tions. Such tables and examples of series summation are given in AMS-55, 
Chapter 6. 


EXAMPLE 10.2.1 Catalan’s Constant 

Catalan’s constant, Exercise 5.2.22, or β(2) of Section 5.9 is given by

\[ K = \beta(2) = \sum_{k=0}^{\infty} \frac{(-1)^k}{(2k+1)^2}. \tag{10.44d} \]

⁴ Table of the Gamma Function for Complex Arguments, National Bureau of
Standards, Applied Mathematics Series No. 34.

Grouping the positive and negative terms separately and starting with unit
index (to match the form of $F^{(1)}$, Eq. 10.41), we obtain

\[ K = 1 + \sum_{n=1}^{\infty} \frac{1}{(4n+1)^2} - \frac{1}{9} - \sum_{n=1}^{\infty} \frac{1}{(4n+3)^2}. \]

Now, quoting Eqs. 10.41 and 10.44b, we get

\[ K = \frac{8}{9} + \frac{1}{16}\,\psi^{(1)}(1 + \tfrac14) - \frac{1}{16}\,\psi^{(1)}(1 + \tfrac34). \tag{10.44e} \]

Using the values of $\psi^{(1)}$ from Table 6.1 of AMS-55, we obtain

\[ K = 0.9159\,6559\ldots. \]


Compare this calculation of Catalan’s constant with the calculations of Chapter 
5, either direct summation by machine or a modification using Riemann zeta 
functions and then a (shorter) machine computation. 
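
In place of the AMS-55 table, the trigamma values in Eq. 10.44e can be generated directly. The sketch below is an illustration only: the names `trigamma` and `catalan_constant`, the cutoff `n_direct`, and the three-term Euler-Maclaurin tail are assumptions of this example, not the method used in the text.

```python
def trigamma(x, n_direct=100):
    """psi^(1)(x) = sum over n >= 0 of 1/(x+n)^2, for real x > 0.

    The first n_direct terms are summed directly; the remaining sum is
    approximated by 1/y + 1/(2 y^2) + 1/(6 y^3) with y = x + n_direct.
    """
    head = sum(1.0 / (x + n) ** 2 for n in range(n_direct))
    y = x + n_direct
    tail = 1.0 / y + 1.0 / (2.0 * y * y) + 1.0 / (6.0 * y ** 3)
    return head + tail


def catalan_constant():
    """Catalan's constant from the trigamma form of Eq. 10.44e."""
    return 8.0 / 9.0 + (trigamma(1.25) - trigamma(1.75)) / 16.0
```

The value returned by `catalan_constant()` should agree with the quoted K = 0.9159 6559... to the digits shown.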


EXERCISES 


10.2.1 Verify that the following two forms of the digamma function,

\[ F(x) = \sum_{r=1}^{x} \frac{1}{r} - \gamma \]

and

\[ F(x) = \sum_{r=1}^{\infty} \frac{x}{r(r+x)} - \gamma, \]

are equal to each other (for x a positive integer).

10.2.2 Show that F(z) has the series expansion

\[ F(z) = -\gamma + \sum_{n=2}^{\infty} (-1)^n \zeta(n)\, z^{n-1}. \]

10.2.3 For a power series expansion of ln(z!) AMS-55 lists

\[ \ln(z!) = -\ln(1+z) + z(1-\gamma) + \sum_{n=2}^{\infty} (-1)^n [\zeta(n) - 1]\, z^n/n. \]

(a) Show that this agrees with Eq. 10.44c for |z| < 1.

(b) What is the range of convergence of this new expression?

10.2.4 Show that

\[ \frac{1}{2}\ln\left( \frac{\pi z}{\sin \pi z} \right) = \sum_{n=1}^{\infty} \frac{\zeta(2n)}{2n}\, z^{2n}, \quad |z| < 1. \]

Hint. Try Eq. 10.32.

10.2.5 Write out a Weierstrass infinite product definition of ln(z!). Without
differentiating, show that this leads directly to the Maclaurin expansion of ln(z!), Eq. 10.44c.
10.2.6 Derive the difference relation for the polygamma function

\[ F^{(m)}(z+1) = F^{(m)}(z) + (-1)^m \frac{m!}{(z+1)^{m+1}}, \quad m = 0, 1, 2, \ldots \]

10.2.7 Show that if

\[ \Gamma(x + iy) = u + iv \]

then

\[ \Gamma(x - iy) = u - iv. \]

This is a special case of the Schwarz reflection principle, Section 6.5.

10.2.8 The Pochhammer symbol $(a)_n$ is defined as

\[ (a)_n = a(a+1)\cdots(a+n-1), \qquad (a)_0 = 1 \]

(for integral n).

(a) Express $(a)_n$ in terms of factorials.

(b) Find $(d/da)(a)_n$ in terms of $(a)_n$ and digamma functions.

ANS. \[ \frac{d}{da}(a)_n = (a)_n\,[F(a+n-1) - F(a-1)]. \]

(c) Show that

\[ (a)_{n+k} = (a+n)_k \cdot (a)_n. \]

10.2.9 Verify the following special values of the ψ form of the di- and polygamma
functions:

\[ \psi(1) = -\gamma, \qquad \psi^{(1)}(1) = \zeta(2), \qquad \psi^{(2)}(1) = -2\zeta(3). \]

10.2.10 Derive the polygamma function recurrence relation

\[ \psi^{(m)}(1+z) = \psi^{(m)}(z) + (-1)^m\, m!/z^{m+1}, \quad m = 0, 1, 2, \ldots \]

10.2.11 Verify

(a) \[ \int_0^\infty e^{-r} \ln r\,dr = -\gamma. \]

(b) \[ \int_0^\infty r e^{-r} \ln r\,dr = 1 - \gamma. \]

(c) \[ \int_0^\infty r^n e^{-r} \ln r\,dr = (n-1)! + n\int_0^\infty r^{n-1} e^{-r} \ln r\,dr, \quad n = 1, 2, 3, \ldots \]

Hint. These may be verified by integration by parts, three parts, or differentiating
the integral form of n! with respect to n.

10.2.12 Dirac relativistic wave functions for hydrogen involve factors such as
$[2(1 - \alpha^2 Z^2)^{1/2}]!$ where α, the fine structure constant, is 1/137 and Z is the atomic number.
Expand $[2(1 - \alpha^2 Z^2)^{1/2}]!$ in a series of powers of $\alpha^2 Z^2$.

10.2.13 The quantum mechanical description of a particle in a Coulomb field requires
a knowledge of the phase of the complex factorial function. Determine the phase
of (1 + ib)! for small b.
10.2.14 The total energy radiated by a black body is given by

\[ u = \frac{8\pi k^4 T^4}{c^3 h^3} \int_0^\infty \frac{x^3}{e^x - 1}\,dx. \]

Show that the integral in this expression is equal to 3! ζ(4). [ζ(4) = π⁴/90 =
1.0823....] The final result is the Stefan-Boltzmann law.

10.2.15 As a generalization of the result in Exercise 10.2.14, show that

\[ \int_0^\infty \frac{x^s\,dx}{e^x - 1} = s!\,\zeta(s+1), \quad \Re(s) > 0. \]

10.2.16 The neutrino energy density (Fermi distribution) in the early history of the
universe is given by

\[ \rho_\nu = \frac{4\pi}{h^3} \int_0^\infty \frac{x^3}{\exp(x/kT) + 1}\,dx. \]

Show that

\[ \rho_\nu = \frac{7\pi^5}{30h^3}\,(kT)^4. \]

10.2.17 Prove that

\[ \int_0^\infty \frac{x^s\,dx}{e^x + 1} = s!\,(1 - 2^{-s})\,\zeta(s+1), \quad \Re(s) > 0. \]

Exercises 10.2.15 and 10.2.17 actually constitute Mellin integral transforms
(compare Section 15.1).

10.2.18 Prove that

\[ \psi^{(n)}(z) = (-1)^{n+1} \int_0^\infty \frac{t^n e^{-zt}}{1 - e^{-t}}\,dt, \quad \Re(z) > 0. \]

10.2.19 Using di- and polygamma functions sum the series

(a) \[ \sum_{n=1}^{\infty} \frac{1}{n(n+1)}, \]

(b) \[ \sum_{n=2}^{\infty} \frac{1}{n^2 - 1}. \]

Note. You can use Exercise 10.2.6 to calculate the needed digamma functions.

10.2.20 Show that

\[ \sum_{n=1}^{\infty} \frac{1}{(n+a)(n+b)} = \frac{1}{b-a}\{F(b) - F(a)\} = \frac{1}{b-a}\{\psi(1+b) - \psi(1+a)\}, \]

a ≠ b, and neither a nor b is a negative integer. It is of some interest to compare
this summation with the corresponding integral

\[ \int_1^\infty \frac{dx}{(x+a)(x+b)} = \frac{1}{b-a}\{\ln(1+b) - \ln(1+a)\}. \]

The relation between ψ(x) (or F(x)) and ln x is made explicit in Eq. 10.51 in the
next section.


10.2.21 Verify the contour integral representation of ζ(s),

\[ \zeta(s) = -\frac{(-s)!}{2\pi i} \int_C \frac{(-z)^{s-1}}{e^z - 1}\,dz. \]

The contour C is the same as that for Eq. 10.35. The points z = ±2nπi, n =
1, 2, 3, ... are all excluded.

10.2.22 Show that ζ(s) is analytic in the entire finite complex plane except at s = 1
where it has a simple pole with a residue of +1.

Hint. The contour integral representation will be useful.

10.2.23 Using the complex variable capability of FORTRAN IV calculate ℜ(1 + ib)!,
ℑ(1 + ib)!, |(1 + ib)!| and phase (1 + ib)! for b = 0.0(0.1)1.0. Plot the phase of
(1 + ib)! versus b.

Hint. Exercise 10.2.3 offers a convenient approach. You will need to calculate
ζ(n).


10.3 STIRLING'S SERIES 


For computation of ln(z!) for very large z (statistical mechanics) and for 
numerical computations at nonintegral values of z a series expansion of ln(z!) 
in negative powers of z is desirable. Perhaps the most elegant way of deriving 
such an expansion is by the method of steepest descents (Section 7.4). The 
following method, starting with a numerical integration formula, does not 
require knowledge of contour integration and is particularly direct. 


Derivation from Euler-Maclaurin Integration Formula 

The Euler-Maclaurin formula for evaluating a definite integral¹ is

\[ \int_0^n f(x)\,dx = \tfrac12 f(0) + f(1) + f(2) + \cdots + \tfrac12 f(n) - b_2[f'(n) - f'(0)] - b_4[f'''(n) - f'''(0)] - \cdots, \tag{10.45} \]


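in which the $b_{2n}$ are related to the Bernoulli numbers $B_{2n}$ (compare Section 5.9)
by

\[ (2n)!\, b_{2n} = B_{2n}, \tag{10.46} \]

\[ B_0 = 1, \quad B_2 = \tfrac16, \quad B_4 = -\tfrac1{30}, \quad B_6 = \tfrac1{42}, \quad B_8 = -\tfrac1{30}, \quad B_{10} = \tfrac5{66}, \text{ and so on.} \tag{10.47} \]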
By applying Eq. 10.45 to the definite integral

\[ \int_0^\infty \frac{dx}{(z+x)^2} = \frac{1}{z} \tag{10.48} \]

(for z not on the negative real axis), we obtain

\[ \frac{1}{z} = \frac{1}{2z^2} + F^{(1)}(z) - \frac{2!\,b_2}{z^3} - \frac{4!\,b_4}{z^5} - \cdots. \tag{10.49} \]

¹ Obtained by repeated integration by parts, Section 5.9.


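This is the reason for using Eq. 10.48. The Euler-Maclaurin evaluation yields
$F^{(1)}(z)$, which is $d^2 \ln(z!)/dz^2$.

Using Eq. 10.46 and solving for $F^{(1)}(z)$, we have

\[ F^{(1)}(z) = \frac{d}{dz}F(z) = \frac{1}{z} - \frac{1}{2z^2} + \frac{B_2}{z^3} + \frac{B_4}{z^5} + \cdots = \frac{1}{z} - \frac{1}{2z^2} + \sum_{n=1}^{\infty} \frac{B_{2n}}{z^{2n+1}}. \tag{10.50} \]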


Since the Bernoulli numbers diverge strongly, this series does not converge! It
is a semiconvergent or asymptotic series, useful for computation despite its
divergence (compare Section 5.10).

Integrating once, we get the digamma function

\[ F(z) = C_1 + \ln z + \frac{1}{2z} - \frac{B_2}{2z^2} - \frac{B_4}{4z^4} - \cdots = C_1 + \ln z + \frac{1}{2z} - \sum_{n=1}^{\infty} \frac{B_{2n}}{2n\,z^{2n}}. \tag{10.51} \]

Integrating Eq. 10.51 with respect to z from z − 1 to z and then letting z approach
infinity, $C_1$, the constant of integration, may be shown to vanish. This
gives us a second expression for the digamma function, often more useful than
Eq. 10.38.


Stirling's Series 

The indefinite integral of the digamma function (Eq. 10.51) is

\[ \ln(z!) = C_2 + (z + \tfrac12)\ln z - z + \frac{B_2}{2z} + \cdots + \frac{B_{2n}}{2n(2n-1)z^{2n-1}} + \cdots, \tag{10.52} \]

in which $C_2$ is another constant of integration. To fix $C_2$ we find it convenient to
use the doubling or Legendre duplication formula derived in Section 10.4,

\[ z!\,(z - \tfrac12)! = 2^{-2z}\,\pi^{1/2}\,(2z)!. \tag{10.53} \]

This may be proved directly when z is a positive integer by writing (2z)! as a
product of even terms times a product of odd terms and extracting a factor of
two from each term (Exercise 10.3.5). Substituting Eq. 10.52 into the logarithm
of the doubling formula, we find that $C_2$ is

\[ C_2 = \tfrac12 \ln 2\pi, \tag{10.54} \]

giving

\[ \ln(z!) = \tfrac12 \ln 2\pi + (z + \tfrac12)\ln z - z + \frac{1}{12z} - \frac{1}{360z^3} + \frac{1}{1260z^5} - \cdots. \tag{10.55} \]
FIG. 10.5 Accuracy of Stirling’s formula


This is Stirling’s series, an asymptotic expansion. The absolute value of the error 
is less than the absolute value of the first term neglected. 
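
To get a feeling for this error bound, one can compare the truncated series of Eq. 10.55 with a library value of ln Γ(z + 1). The Python sketch below is illustrative only; the function name `ln_factorial_stirling` is an assumption of this example, and the comparison value 1/(1680 z⁷) is the magnitude of the first neglected term, $B_8/(8\cdot 7\,z^7)$ from Eq. 10.52.

```python
import math


def ln_factorial_stirling(z):
    """ln(z!) from Eq. 10.55, truncated after the 1/(1260 z^5) term."""
    return (0.5 * math.log(2.0 * math.pi)
            + (z + 0.5) * math.log(z) - z
            + 1.0 / (12.0 * z)
            - 1.0 / (360.0 * z ** 3)
            + 1.0 / (1260.0 * z ** 5))


def first_neglected_term(z):
    """Magnitude of the next term of the series, 1/(1680 z^7)."""
    return 1.0 / (1680.0 * z ** 7)
```

Even at z = 5 the neglected term is below 10⁻⁸, which is why a handful of terms suffices once z is moderately large.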

The constants of integration $C_1$ and $C_2$ may also be evaluated by comparison
with the first term of the series expansion obtained by the method of “steepest
descent.” This is carried out in Section 7.4.

To help convey a feeling of the remarkable precision of Stirling’s series for s!
the ratio of the first term of Stirling’s approximation to s! is plotted in Fig. 10.5.
A tabulation gives the ratio of the first term in the expansion to s! and the ratio
of the first two terms in the expansion to s! (Table 10.2). The derivation of these
forms is Exercise 10.3.1.


TABLE 10.2

  s     \sqrt{2\pi} s^{s+1/2} e^{-s} / s!     \sqrt{2\pi} s^{s+1/2} e^{-s} [1 + 1/(12s)] / s!

  1     0.92213                               0.99898
  2     0.95950                               0.99949
  3     0.97270                               0.99972
  4     0.97942                               0.99983
  5     0.98349                               0.99988
  6     0.98621                               0.99992
  7     0.98817                               0.99994
  8     0.98964                               0.99995
  9     0.99078                               0.99996
 10     0.99170                               0.99998
Numerical Computation

The possibility of using the Maclaurin expansion, Eq. 10.44c, for the numerical
evaluation of the factorial function is mentioned in Section 10.2. However,
for large x, Stirling’s series, Eq. 10.55, gives much more rapid convergence. The
Table of the Gamma Function for Complex Arguments, National Bureau of
Standards, Applied Mathematics Series No. 34, is based on the use of Stirling’s
series for z = x + iy, 9 < x < 10. Lower values of x are reached with the recurrence
relation, Eq. 10.29. Now suppose the numerical value of x! is needed for
some particular value of x in a program in a large, high-speed digital computer.
How shall we instruct the computer to compute x!? Stirling’s series followed by
the recurrence relation is a good possibility. An even better possibility is to fit
x!, 0 < x < 1, by a short power series (polynomial) and then calculate x!
directly from this empirical fit. Presumably, the computing machine has been
told the values of the coefficients of the polynomial. Such polynomial fits
have been made by Hastings² for various accuracy requirements. For example,

\[ x! = 1 + \sum_{n=1}^{8} b_n x^n + \varepsilon(x), \tag{10.56a} \]

with

    b_1 = -0.57719 1652        b_5 = -0.75670 4078
    b_2 =  0.98820 5891        b_6 =  0.48219 9394
    b_3 = -0.89705 6937        b_7 = -0.19352 7818
    b_4 =  0.91820 6857        b_8 =  0.03586 8343        (10.56b)

with the magnitude of the error $|\varepsilon(x)| < 3 \times 10^{-7}$, 0 < x < 1.

This is not a least-squares fit. Hastings employed a Chebyshev polynomial
technique similar to that described in Section 13.4 to minimize the maximum
value of $|\varepsilon(x)|$.
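
The polynomial fit of Eq. 10.56 followed by the recurrence relation can be sketched in a few lines of Python. The coefficients are those quoted in Eq. 10.56b; the function names and the restriction to x ≥ 0 are assumptions of this example rather than anything specified in the text.

```python
import math

# Coefficients b_1 ... b_8 of Eq. 10.56b
HASTINGS_B = (
    -0.577191652, 0.988205891, -0.897056937, 0.918206857,
    -0.756704078, 0.482199394, -0.193527818, 0.035868343,
)


def factorial_unit_interval(x):
    """x! for 0 <= x <= 1 from the polynomial of Eq. 10.56a."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("polynomial fit is only valid for 0 <= x <= 1")
    total = 0.0
    for b in reversed(HASTINGS_B):  # Horner evaluation of sum b_n x^n
        total = (total + b) * x
    return 1.0 + total


def factorial_real(x):
    """x! for real x >= 0: fit on the fractional part, then x! = x (x-1)!."""
    if x < 0.0:
        raise ValueError("this sketch handles x >= 0 only")
    n = math.floor(x)
    frac = x - n
    result = factorial_unit_interval(frac)
    for k in range(1, n + 1):
        result *= frac + k
    return result
```

For example, `factorial_real(3.5)` applies the fit at 0.5 and then multiplies by 1.5, 2.5, and 3.5; the relative error is inherited from the fit on the unit interval.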


EXERCISES 


10.3.1 Rewrite Stirling’s series to give z! instead of ln(z!).

ANS. \[ z! = \sqrt{2\pi}\, z^{z+1/2} e^{-z} \left( 1 + \frac{1}{12z} + \frac{1}{288z^2} - \frac{139}{51{,}840z^3} + \cdots \right). \]

10.3.2 Use Stirling’s formula to estimate 52!, the number of possible rearrangements
of cards in a standard deck of playing cards.

10.3.3 By integrating Eq. 10.51 from z − 1 to z and then letting z → ∞, evaluate the
constant $C_1$ in the asymptotic series for the digamma function F(z).

10.3.4 Show that the constant $C_2$ in Stirling’s formula equals $\tfrac12 \ln 2\pi$ by using the
logarithm of the doubling formula.

10.3.5 By direct expansion verify the doubling formula for $z = n + \tfrac12$; n is an integer.

² C. Hastings, Jr., Approximations for Digital Computers. Princeton, NJ:
Princeton University Press (1955).
10.3.6 Without using Stirling’s series show that

(a) \[ \ln(n!) < \int_1^{n+1} \ln x\,dx, \]

(b) \[ \ln(n!) > \int_1^{n} \ln x\,dx; \]

n is an integer ≥ 2.

Notice that the arithmetic mean of these two integrals gives a good approximation
for Stirling’s series.


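10.3.7 Test for convergence

\[ \sum_{p=0}^{\infty} \left[ \frac{(p - \frac12)!}{p!} \right]^2 \times \frac{2p+1}{2p+2} = \pi \sum_{p=0}^{\infty} \frac{(2p-1)!!\,(2p+1)!!}{(2p)!!\,(2p+2)!!}. \]

This series arises in an attempt to describe the magnetic field created by and
enclosed by a current loop.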


10.3.8 Show that

\[ \lim_{x\to\infty} x^{b-a}\, \frac{(x+a)!}{(x+b)!} = 1. \]

10.3.9 Show that

\[ \lim_{n\to\infty} \frac{(2n-1)!!}{(2n)!!}\, n^{1/2} = \pi^{-1/2}. \]


10.3.10 Calculate the binomial coefficient $\binom{2n}{n}$ to six significant figures
for n = 10, 20, and 30. Check your values by

(a) a Stirling series approximation through terms in $n^{-1}$,

(b) a double precision calculation.

ANS. \[ \binom{20}{10} = 1.84756 \times 10^{5}, \quad \binom{40}{20} = 1.37846 \times 10^{11}, \quad \binom{60}{30} = 1.18264 \times 10^{17}. \]


10.3.11 Write a program (or subprogram) that will calculate $\log_{10}(x!)$ directly from
Stirling’s series. Assume that x > 10. (Smaller values could be calculated via
the factorial recurrence relation.) Tabulate $\log_{10}(x!)$ versus x for x = 10(10)300.
Check your results against AMS-55 or by direct multiplication (for n = 10, 20,
and 30).

Check value. $\log_{10}(100!) = 157.97$.

10.3.12 Using the complex capability of FORTRAN IV, write a subroutine that will
calculate ln(z!) for complex z based on Stirling’s series. Include a test and an
appropriate error message if z is too close to a negative real integer. Check your
subroutine against alternate calculations for z real, z pure imaginary, and
z = 1 + ib (Exercise 10.2.23).

Check values. |(i0.5)!| = 0.82618
phase (i0.5)! = −0.24406.






FIG. 10.6 Transformation from cartesian to polar coordinates


10.4 THE BETA FUNCTION


Using the integral definition (Eq. 10.25), we write the product of two factorials
as the product of two integrals. To facilitate a change in variables, we take the
integrals over a finite range.

\[ m!\,n! = \lim_{a^2\to\infty} \int_0^{a^2} e^{-u} u^m\,du \int_0^{a^2} e^{-v} v^n\,dv, \quad \Re(m) > -1, \ \Re(n) > -1. \tag{10.57a} \]

Replacing u with x² and v with y², we obtain

\[ m!\,n! = \lim_{a\to\infty} 4 \int_0^{a} e^{-x^2} x^{2m+1}\,dx \int_0^{a} e^{-y^2} y^{2n+1}\,dy. \tag{10.57b} \]

Transforming to polar coordinates gives us

\[ m!\,n! = \lim_{a\to\infty} 4 \int_0^{a} e^{-r^2} r^{2m+2n+3}\,dr \int_0^{\pi/2} \cos^{2m+1}\theta\, \sin^{2n+1}\theta\,d\theta = (m+n+1)!\; 2\int_0^{\pi/2} \cos^{2m+1}\theta\, \sin^{2n+1}\theta\,d\theta. \tag{10.58} \]

Here the cartesian area element dx dy has been replaced by r dr dθ (Fig. 10.6). The
last equality in Eq. 10.58 follows from Exercise 10.1.11.

The definite integral, together with the factor 2, has been named the beta
function

\[ B(m+1, n+1) \equiv 2\int_0^{\pi/2} \cos^{2m+1}\theta\, \sin^{2n+1}\theta\,d\theta = \frac{m!\,n!}{(m+n+1)!} = B(n+1, m+1). \tag{10.59a} \]

Equivalently, in terms of the gamma function

\[ B(p, q) = \frac{\Gamma(p)\,\Gamma(q)}{\Gamma(p+q)}. \tag{10.59b} \]
The only reason for choosing m + 1 and n + 1, rather than m and n, as the arguments
of B is to be in agreement with the conventional, historical beta function.

In this manipulation the transformation from cartesian to polar coordinates
needs some justification. As seen in Fig. 10.6, the shaded area is being neglected.
However, the maximum value of the integrand in this region is $e^{-a^2} a^{2m+2n+3}$,
which vanishes so strongly as a approaches infinity that the integral over the
neglected region vanishes.
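
The equality of the trigonometric integral and the factorial ratio in Eq. 10.59a is easy to spot-check numerically. The following Python sketch is an illustration under stated assumptions: Simpson’s rule with an even number of `panels` is a choice made here, the function names are hypothetical, and the check is meant for m, n ≥ 0 where the integrand is smooth.

```python
import math


def beta_trig_integral(m, n, panels=2000):
    """2 * integral from 0 to pi/2 of cos^(2m+1) sin^(2n+1), by Simpson's rule.

    panels must be even.
    """
    h = (math.pi / 2.0) / panels

    def f(theta):
        return math.cos(theta) ** (2 * m + 1) * math.sin(theta) ** (2 * n + 1)

    total = f(0.0) + f(math.pi / 2.0)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * f(k * h)
    return 2.0 * total * h / 3.0


def beta_factorials(m, n):
    """m! n! / (m+n+1)!, written with the gamma function (Eq. 10.59b)."""
    return math.gamma(m + 1) * math.gamma(n + 1) / math.gamma(m + n + 2)
```

For m = 1/2, n = 3/2 both functions should return π/16, since the integrand is then cos²θ sin⁴θ.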


Definite Integrals, Alternate Forms 

The beta function is useful in the evaluation of a wide variety of definite
integrals. The substitution t = cos²θ converts Eq. 10.59a to¹

\[ B(m+1, n+1) = \frac{m!\,n!}{(m+n+1)!} = \int_0^1 t^m (1-t)^n\,dt. \tag{10.60a} \]

Replacing t by x², we obtain

\[ \frac{m!\,n!}{2(m+n+1)!} = \int_0^1 x^{2m+1} (1-x^2)^n\,dx. \tag{10.60b} \]

The substitution t = u/(1 + u) in Eq. 10.60a yields still another useful form,

\[ \frac{m!\,n!}{(m+n+1)!} = \int_0^\infty \frac{u^m}{(1+u)^{m+n+2}}\,du. \tag{10.61} \]


The beta function as a definite integral is useful in establishing integral repre- 
sentations of the Bessel function (Exercise 11.1.18) and the hypergeometric 
function (Exercise 13.5.7). 


Verification of πa/sin πa Relation

If we take m = a, n = −a, −1 < a < 1, then

\[ \int_0^\infty \frac{u^a}{(1+u)^2}\,du = a!\,(-a)!. \tag{10.62} \]

By contour integration this integral may be shown to be equal to πa/sin πa
(Exercise 7.2.18), thus providing another method of obtaining Eq. 10.32.


Derivation of Legendre Duplication Formula 

The form of Eq. 10.59 suggests that the beta function may be useful in deriving
the doubling formula used in the preceding section. From Eq. 10.60a with
m = n = z and ℜ(z) > −1,

\[ \frac{z!\,z!}{(2z+1)!} = \int_0^1 t^z (1-t)^z\,dt. \tag{10.63} \]

¹ The Laplace transform convolution theorem provides an alternate derivation
of Eq. 10.60a, compare Exercise 15.11.2.

By substituting t = (1 + s)/2, we have

\[ \frac{z!\,z!}{(2z+1)!} = 2^{-2z-1} \int_{-1}^{1} (1-s^2)^z\,ds = 2^{-2z} \int_0^1 (1-s^2)^z\,ds. \tag{10.64} \]

The last equality holds because the integrand is even. Evaluating this integral as
a beta function (Eq. 10.60b), we obtain

\[ \frac{z!\,z!}{(2z+1)!} = 2^{-2z-1}\, \frac{z!\,(-\frac12)!}{(z+\frac12)!}. \tag{10.65} \]

Rearranging terms and recalling that $(-\tfrac12)! = \pi^{1/2}$, we quickly reduce this
equation to one form of the Legendre duplication formula,

\[ z!\,(z+\tfrac12)! = 2^{-2z-1}\, \pi^{1/2}\, (2z+1)!. \tag{10.66a} \]

Dividing by $(z + \tfrac12)$, we obtain an alternate form of the duplication formula,

\[ z!\,(z-\tfrac12)! = 2^{-2z}\, \pi^{1/2}\, (2z)!. \tag{10.66b} \]

Although the integrals used in this derivation are defined only for ℜ(z) > −1,
the results (Eqs. 10.66a and 10.66b) hold for all z by analytic continuation.²

Using the double factorial notation (Section 10.1), we may rewrite Eq. 10.66a
(with z = n, an integer) as

\[ (n+\tfrac12)! = \pi^{1/2}\, (2n+1)!!/2^{n+1}. \tag{10.66c} \]


This is often convenient for eliminating factorials of fractions. 
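
A quick numerical check of the duplication formula, Eq. 10.66b, and of the double factorial form, Eq. 10.66c, can be written with the standard-library gamma function. The helper names below are assumptions of this sketch; `fact(z)` simply stands for z! = Γ(z + 1).

```python
import math


def fact(z):
    """z! = Gamma(z + 1) for real z that is not a negative integer."""
    return math.gamma(z + 1.0)


def duplication_sides(z):
    """Left- and right-hand sides of z!(z - 1/2)! = 2^(-2z) pi^(1/2) (2z)!."""
    lhs = fact(z) * fact(z - 0.5)
    rhs = 2.0 ** (-2.0 * z) * math.sqrt(math.pi) * fact(2.0 * z)
    return lhs, rhs


def half_integer_factorial(n):
    """(n + 1/2)! from Eq. 10.66c, using the double factorial (2n+1)!!."""
    double_fact = 1
    for k in range(1, 2 * n + 2, 2):  # 1 * 3 * 5 * ... * (2n+1)
        double_fact *= k
    return math.sqrt(math.pi) * double_fact / 2 ** (n + 1)
```

For instance, `half_integer_factorial(3)` evaluates π^{1/2} · 105/16, which should match Γ(4.5).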


Incomplete Beta Function 

Just as there is an incomplete gamma function (Section 10.5), there is also
an incomplete beta function,

\[ B_x(p, q) = \int_0^x t^{p-1} (1-t)^{q-1}\,dt, \quad 0 \le x \le 1, \quad p > 0, \quad q > 0 \ (\text{if } x = 1). \tag{10.67} \]

Clearly, $B_{x=1}(p, q)$ becomes the regular (complete) beta function, Eq. 10.60. A
power-series expansion of $B_x(p, q)$ is the subject of Exercises 5.2.18 and 5.7.8.
The relation to hypergeometric functions appears in Section 13.5.

The incomplete beta function makes an appearance in probability theory in
calculating the probability of at most k successes in n independent trials.³

² If 2z is a negative integer, we get the valid but unilluminating result ∞ = ∞.

³ W. Feller, An Introduction to Probability Theory and Its Applications, 3rd ed.,
Section VI.10. New York: Wiley (1968).
EXERCISES

10.4.1 Derive the doubling formula for the factorial function by integrating
$(\sin 2\theta)^{2n+1} = (2\sin\theta\cos\theta)^{2n+1}$ (and using the beta function).

10.4.2 Verify the following beta function identities:

(a) $B(a, b) = B(a+1, b) + B(a, b+1)$,

(b) $B(a, b) = \dfrac{a+b}{b}\, B(a, b+1)$,

(c) $B(a, b) = \dfrac{b-1}{a}\, B(a+1, b-1)$,

(d) $B(a, b)\,B(a+b, c) = B(b, c)\,B(a, b+c)$.


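10.4.3 (a) Show that

\[ \int_{-1}^{1} (1-x^2)^{1/2} x^{2n}\,dx = \begin{cases} \pi/2, & n = 0 \\ \pi\, \dfrac{(2n-1)!!}{(2n+2)!!}, & n = 1, 2, 3, \ldots \end{cases} \]

(b) Show that

\[ \int_{-1}^{1} (1-x^2)^{-1/2} x^{2n}\,dx = \begin{cases} \pi, & n = 0 \\ \pi\, \dfrac{(2n-1)!!}{(2n)!!}, & n = 1, 2, 3, \ldots \end{cases} \]

10.4.4 Show that

\[ \int_{-1}^{1} (1-x^2)^n\,dx = \begin{cases} 2^{2n+1}\, \dfrac{n!\,n!}{(2n+1)!}, & n > -1 \\ 2\, \dfrac{(2n)!!}{(2n+1)!!}, & n = 0, 1, 2, \ldots \end{cases} \]

10.4.5 Evaluate $\int_{-1}^{1} (1+x)^a (1-x)^b\,dx$ in terms of the beta function.

ANS. $2^{a+b+1} B(a+1, b+1)$.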

10.4.6 Show, by means of the beta function, that

\[ \int_t^z \frac{dx}{(z-x)^{1-\alpha} (x-t)^{\alpha}} = \frac{\pi}{\sin \pi\alpha}, \quad 0 < \alpha < 1. \]

This result is used in Section 16.2 to solve Abel’s generalized integral equation.

10.4.7 Show that the Dirichlet integral

\[ \iint x^p y^q\,dx\,dy = \frac{p!\,q!}{(p+q+2)!} = \frac{B(p+1, q+1)}{p+q+2}, \]

where the range of integration is the triangle bounded by the positive x- and
y-axes and the line x + y = 1.

10.4.8 Show that

\[ \int_0^\infty \int_0^\infty e^{-(x^2 + y^2 + 2xy\cos\theta)}\,dx\,dy = \frac{\theta}{2\sin\theta}. \]

What are the limits on θ?

Hint. Consider oblique xy coordinates.

ANS. −π < θ < π.
10.4.9 Evaluate (using the beta function)

(a) \[ \int_0^{\pi/2} \cos^{1/2}\theta\,d\theta = \frac{(2\pi)^{3/2}}{16\,[(\frac14)!]^2}, \]

(b) \[ \int_0^{\pi/2} \cos^n\theta\,d\theta = \int_0^{\pi/2} \sin^n\theta\,d\theta = \frac{\sqrt{\pi}\,[(n-1)/2]!}{2\,(n/2)!} = \begin{cases} \dfrac{(n-1)!!}{n!!} & \text{for } n \text{ odd,} \\ \dfrac{\pi}{2}\cdot\dfrac{(n-1)!!}{n!!} & \text{for } n \text{ even.} \end{cases} \]

10.4.10 Evaluate $\int_0^1 (1-x^4)^{-1/2}\,dx$ as a beta function.

ANS. \[ \frac{[(\frac14)!]^2 \cdot 4}{(2\pi)^{1/2}} = 1.311028777. \]

10.4.11 Given

\[ J_\nu(z) = \frac{2}{\pi^{1/2}(\nu - \frac12)!} \left( \frac{z}{2} \right)^{\nu} \int_0^{\pi/2} \sin^{2\nu}\theta\, \cos(z\cos\theta)\,d\theta, \quad \Re(\nu) > -\tfrac12, \]

show, with the aid of beta functions, that this reduces to the Bessel series

\[ J_\nu(z) = \sum_{s=0}^{\infty} (-1)^s \frac{1}{s!\,(s+\nu)!} \left( \frac{z}{2} \right)^{2s+\nu}, \]

identifying the initial $J_\nu$ as an integral representation of the Bessel function,
$J_\nu$ (Section 11.1).

10.4.12 Given that the associated Legendre polynomial $P_m^m(x) = (2m-1)!!\,(1-x^2)^{m/2}$,
Section 12.5, show that

(a) \[ \int_{-1}^{1} [P_m^m(x)]^2\,dx = \frac{2}{2m+1}\,(2m)!, \quad m = 0, 1, 2, \ldots, \]

(b) \[ \int_{-1}^{1} [P_m^m(x)]^2\, \frac{dx}{1-x^2} = 2\cdot(2m-1)!, \quad m = 1, 2, 3, \ldots \]

10.4.13 Show that

(a) \[ \int_0^1 (x^2)^{s+1/2} (1-x^2)^{-1/2}\,dx = \frac{(2s)!!}{(2s+1)!!}, \]

(b) \[ \int_0^1 (x^2)^{p} (1-x^2)^{q}\,dx = \frac{1}{2}\, \frac{(p-\frac12)!\,q!}{(p+q+\frac12)!}. \]

10.4.14 A particle of mass m moving in a symmetric potential that is well described by
$V(x) = A|x|^n$ has a total energy $\tfrac12 m(dx/dt)^2 + V(x) = E$. Solving for dx/dt and
integrating, we find that the period of motion is

\[ \tau = 2\sqrt{2m} \int_0^{x_{\max}} \frac{dx}{(E - Ax^n)^{1/2}}, \]

where $x_{\max}$ is a classical turning point given by $Ax_{\max}^n = E$. Show that

\[ \tau = \frac{2}{n} \sqrt{\frac{2\pi m}{E}} \left( \frac{E}{A} \right)^{1/n} \frac{\Gamma(1/n)}{\Gamma(1/n + \frac12)}. \]

10.4.15 Referring to Exercise 10.4.14,

(a) Determine the limit as n → ∞ of

\[ \frac{2}{n} \sqrt{\frac{2\pi m}{E}} \left( \frac{E}{A} \right)^{1/n} \frac{\Gamma(1/n)}{\Gamma(1/n + \frac12)}. \]

(b) Find $\lim_{n\to\infty} \tau$ from the behavior of the integrand, $(E - Ax^n)^{-1/2}$.

(c) Investigate the behavior of the physical system (potential well) as n → ∞.
Obtain the period from inspection of this limiting physical system.


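10.4.16 Show that

\[ \int_0^\infty \frac{\sinh^{\alpha} x}{\cosh^{\beta} x}\,dx = \frac{1}{2}\, B\left( \frac{\alpha+1}{2}, \frac{\beta-\alpha}{2} \right), \quad -1 < \alpha < \beta. \]

Hint. Let sinh² x = u.

10.4.17 The beta distribution of probability theory has a probability density

\[ f(x) = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, x^{\alpha-1} (1-x)^{\beta-1} \]

with x restricted to the interval (0, 1). Show that

(a) $\langle x\rangle$ (mean) $= \dfrac{\alpha}{\alpha+\beta}$,

(b) $\sigma^2$ (variance) $\equiv \langle x^2\rangle - \langle x\rangle^2 = \dfrac{\alpha\beta}{(\alpha+\beta)^2 (\alpha+\beta+1)}$.

10.4.18 From

\[ \lim_{n\to\infty} \frac{\int_0^{\pi/2} \sin^{2n}\theta\,d\theta}{\int_0^{\pi/2} \sin^{2n+1}\theta\,d\theta} = 1 \]

derive the Wallis formula for π:

\[ \frac{\pi}{2} = \frac{2\cdot2}{1\cdot3}\cdot\frac{4\cdot4}{3\cdot5}\cdot\frac{6\cdot6}{5\cdot7}\cdots. \]

10.4.19 Tabulate the beta function B(p, q) for p and q = 1.0(0.1)2.0, independently.
Check value. B(1.3, 1.7) = 0.40774.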

10.4.20 (a) Write a subroutine that will calculate the incomplete beta function $B_x(p, q)$.
For 0.5 < x ≤ 1 you will find it convenient to use the relation

\[ B_x(p, q) = B(p, q) - B_{1-x}(q, p). \]

(b) Tabulate B x (%, §). Spot check your results by using the Gauss-Legendre 
quadrature. 


10.5 THE INCOMPLETE GAMMA FUNCTIONS AND 
RELATED FUNCTIONS 

Generalizing the Euler definition of the gamma function (Eq. 10.5), we define
the incomplete gamma functions by the variable limit integrals

\[ \gamma(a, x) = \int_0^x e^{-t} t^{a-1}\,dt, \quad \Re(a) > 0 \]

and

\[ \Gamma(a, x) = \int_x^\infty e^{-t} t^{a-1}\,dt. \tag{10.68} \]

Clearly, the two functions are related, for

\[ \gamma(a, x) + \Gamma(a, x) = \Gamma(a). \tag{10.69} \]

The choice of employing γ(a, x) or Γ(a, x) is purely a matter of convenience.
If the parameter a is a positive integer, Eqs. 10.68 may be integrated completely
to yield

\[ \gamma(n, x) = (n-1)! \left( 1 - e^{-x} \sum_{s=0}^{n-1} \frac{x^s}{s!} \right), \]

\[ \Gamma(n, x) = (n-1)!\, e^{-x} \sum_{s=0}^{n-1} \frac{x^s}{s!}, \quad n = 1, 2, \ldots \tag{10.70} \]

For nonintegral a, a power-series expansion of γ(a, x) for small x and an
asymptotic expansion of Γ(a, x) are developed in Sections 5.7 and 5.10.

\[ \gamma(a, x) = x^a \sum_{n=0}^{\infty} (-1)^n \frac{x^n}{n!\,(a+n)}. \tag{10.71} \]


These incomplete gamma functions may also be expressed quite elegantly in 
terms of confluent hypergeometric functions (compare Section 13.6). 
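
The power series of Eq. 10.71 can be checked against the closed form of Eq. 10.70 for integer a, and combined with Eq. 10.69 to obtain Γ(a, x). The Python sketch below is illustrative; the function names, the stopping tolerance `tol`, and the term limit `max_terms` are assumptions of this example, and the alternating series is only practical for small to moderate x.

```python
import math


def lower_gamma_series(a, x, tol=1e-16, max_terms=500):
    """gamma(a, x) = x^a * sum_n (-1)^n x^n / (n! (a+n)), Eq. 10.71."""
    term = 1.0   # holds (-1)^n x^n / n!
    total = 0.0
    for n in range(max_terms):
        contrib = term / (a + n)
        total += contrib
        if abs(contrib) < tol * abs(total):
            break
        term *= -x / (n + 1)
    return x ** a * total


def lower_gamma_integer(n, x):
    """Closed form of gamma(n, x) for positive integer n, Eq. 10.70."""
    partial = sum(x ** s / math.factorial(s) for s in range(n))
    return math.factorial(n - 1) * (1.0 - math.exp(-x) * partial)


def upper_gamma(a, x):
    """Gamma(a, x) = Gamma(a) - gamma(a, x), Eq. 10.69."""
    return math.gamma(a) - lower_gamma_series(a, x)
```

For large x the subtraction in `upper_gamma` loses accuracy, which is where the asymptotic expansion mentioned in the text becomes the better tool.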


Exponential Integral 

Although the incomplete gamma function Γ(a, x) in its general form (Eq.
10.68) is only infrequently encountered in physical problems, a special case is
quite common and very useful. We define the exponential integral by¹

\[ -Ei(-x) \equiv \int_x^\infty \frac{e^{-t}}{t}\,dt \equiv E_1(x). \tag{10.72} \]

(See Fig. 10.7.) To obtain a series expansion for small x, we start from

\[ E_1(x) = \Gamma(0, x) = \lim_{a\to 0} [\Gamma(a) - \gamma(a, x)]. \tag{10.73} \]


¹ The appearance of the two minus signs in −Ei(−x) is an historical monstrosity.
AMS-55 denotes this integral as $E_1(x)$.

Caution is needed here, for the integral in Eq. 10.72 diverges logarithmically as
x → 0. We may split the divergent term in the series expansion for γ(a, x),

\[ E_1(x) = \lim_{a\to 0} \left[ \frac{a\Gamma(a) - x^a}{a} \right] - \sum_{n=1}^{\infty} \frac{(-1)^n x^n}{n\cdot n!}. \tag{10.74} \]

Using l’Hospital’s rule (Exercise 5.6.9) and

\[ \frac{d}{da}\{a\Gamma(a)\} = \frac{d}{da}\,a! = \frac{d}{da}\,e^{\ln(a!)} = a!\,F(a), \tag{10.74a} \]

and then Eq. 10.40,² we obtain

\[ E_1(x) = -\gamma - \ln x - \sum_{n=1}^{\infty} \frac{(-1)^n x^n}{n\cdot n!}, \tag{10.75} \]


useful for small x. An asymptotic expansion is given in Section 5.10. 
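The small-x series, Eq. 10.75, is easy to sum numerically. The following Python sketch is one possible arrangement, not a library routine: the name `exp_integral_e1`, the hard-coded value of the Euler-Mascheroni constant, and the stopping tolerance are our own assumptions. It is intended for small and moderate positive x only; for large x the alternating terms cancel and an asymptotic form is the better tool.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def exp_integral_e1(x: float, tol: float = 1e-16, max_terms: int = 200) -> float:
    """E_1(x) for x > 0 from the power series of Eq. 10.75."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    total = 0.0
    power = 1.0  # holds (-x)^n / n!
    for n in range(1, max_terms + 1):
        power *= -x / n
        term = power / n  # (-1)^n x^n / (n * n!)
        total += term
        if abs(term) < tol:
            break
    return -EULER_GAMMA - math.log(x) - total
```

The check value E_1(1.0) = 0.219384 quoted in Exercise 10.5.15 provides a convenient spot test.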

Further special forms related to the exponential integral are the sine integral, 
cosine integral (Fig. 10.8), and logarithmic integral defined by 3 


$$si(x) = -\int_x^\infty \frac{\sin t}{t}\,dt,$$

$$Ci(x) = -\int_x^\infty \frac{\cos t}{t}\,dt, \tag{10.76}$$

$$li(x) = \int_0^x \frac{du}{\ln u} = Ei(\ln x).$$


By transforming from real to imaginary argument, we can show that 

$$si(x) = \frac{1}{2i}\left[Ei(ix) - Ei(-ix)\right] = \frac{1}{2i}\left[E_1(ix) - E_1(-ix)\right], \tag{10.77}$$

whereas

$$Ci(x) = \frac{1}{2}\left[Ei(ix) + Ei(-ix)\right] = -\frac{1}{2}\left[E_1(ix) + E_1(-ix)\right], \qquad |\arg x| < \frac{\pi}{2}. \tag{10.78}$$

Adding these two relations, we obtain

$$Ei(ix) = Ci(x) + i\,si(x), \tag{10.79}$$

to show that the relation among these integrals is exactly analogous to that
among $e^{ix}$, $\cos x$, and $\sin x$. In terms of $E_1$,

$$E_1(ix) = -Ci(x) + i\,si(x).$$

2 $dx^a/da = x^a \ln x$.

3 Another sine integral is given by $Si(x) = si(x) + \pi/2$.

Asymptotic expansions of $Ci(x)$ and $si(x)$ are developed in Section 5.10.
Power-series expansions about the origin for $Ci(x)$, $si(x)$, and $li(x)$ may be
obtained from those for the exponential integral, $E_1(x)$, or by direct integration,
Exercise 10.5.10. The exponential, sine, and cosine integrals are tabulated in
AMS-55, Chapter 5.


Error Integrals 

The error integrals

$$\operatorname{erf} z = \frac{2}{\sqrt{\pi}}\int_0^z e^{-t^2}\,dt, \qquad \operatorname{erfc} z = 1 - \operatorname{erf} z = \frac{2}{\sqrt{\pi}}\int_z^\infty e^{-t^2}\,dt \tag{10.80a}$$

(normalized so that $\operatorname{erf}\infty = 1$) are introduced in Section 5.10 (Fig. 10.9).
Asymptotic forms are developed there. From the general form of the integrands
and Eq. 10.6 we expect that erf z and erfc z may be written as incomplete gamma
functions with $a = \tfrac{1}{2}$. The relations are

$$\operatorname{erf} z = \pi^{-1/2}\,\gamma(\tfrac{1}{2}, z^2),$$

$$\operatorname{erfc} z = \pi^{-1/2}\,\Gamma(\tfrac{1}{2}, z^2). \tag{10.80b}$$

The power-series expansion of erf z follows directly from Eq. 10.71.
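To make the last remark concrete: setting a = 1/2 and x = z^2 in Eq. 10.71 gives erf z = (2/sqrt(pi)) sum_n (-1)^n z^(2n+1) / (n! (2n+1)). The Python sketch below sums this series; the name `erf_series` and the tolerance are our own, and the approach is only sensible for modest |z|, where the alternating terms do not cancel severely.

```python
import math


def erf_series(z: float, tol: float = 1e-16, max_terms: int = 200) -> float:
    """erf z from Eq. 10.71 with a = 1/2 and x = z**2."""
    total = 0.0
    power = z  # holds (-1)^n z^(2n+1) / n!
    for n in range(max_terms):
        term = power / (2 * n + 1)
        total += term
        if abs(term) <= tol * abs(total):
            break
        power *= -z * z / (n + 1)
    return 2.0 / math.sqrt(math.pi) * total
```

Comparing against the standard-library `math.erf` for a few arguments is a simple way to confirm the series.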


EXERCISES 


10.5.1 Show that

$$\gamma(a,x) = e^{-x}\sum_{n=0}^\infty \frac{(a-1)!}{(a+n)!}\,x^{a+n}$$

(a) by repeatedly integrating by parts.

(b) Demonstrate this relation by transforming it into Eq. 10.71.


10.5.2 Show that

(a) $\dfrac{d^m}{dx^m}\left[x^{-a}\gamma(a,x)\right] = (-1)^m x^{-a-m}\,\gamma(a+m,x)$,

(b) $\dfrac{d^m}{dx^m}\left[e^{x}\gamma(a,x)\right] = e^{x}\,\dfrac{\Gamma(a)}{\Gamma(a-m)}\,\gamma(a-m,x)$.


10.5.3 Show that $\gamma(a,x)$ and $\Gamma(a,x)$ satisfy the recurrence relations

(a) $\gamma(a+1,x) = a\,\gamma(a,x) - x^a e^{-x}$,

(b) $\Gamma(a+1,x) = a\,\Gamma(a,x) + x^a e^{-x}$.


10.5.4 The potential produced by a 1s hydrogen electron is (Exercise 12.8.6) given by

$$V(r) = \frac{q}{4\pi\varepsilon_0 a_0}\left\{\frac{1}{2r}\,\gamma(3,2r) + \Gamma(2,2r)\right\}.$$

(a) For $r \ll 1$ show that

$$V(r) = \frac{q}{4\pi\varepsilon_0 a_0}\left\{1 - \frac{2}{3}r^2 + \cdots\right\}.$$

(b) For $r \gg 1$ show that

$$V(r) = \frac{q}{4\pi\varepsilon_0 a_0}\cdot\frac{1}{r}.$$

Here r is a pure number, the number of Bohr radii, $a_0$.

Note. For computation at intermediate values of r, Eqs. 10.70 are convenient.


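10.5.5 The potential of a 2p hydrogen electron is found to be (Exercise 12.8.7)

$$V(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\cdot\frac{q}{24a_0}\left\{\frac{1}{r}\,\gamma(5,r) + \Gamma(4,r)\right\} - \frac{1}{4\pi\varepsilon_0}\cdot\frac{q}{120a_0}\left\{\frac{1}{r^3}\,\gamma(7,r) + r^2\,\Gamma(2,r)\right\}P_2(\cos\theta).$$

Here r is expressed in units of $a_0$, the Bohr radius. $P_2(\cos\theta)$ is a Legendre
polynomial (Section 12.1).

(a) For $r \ll 1$, show that

$$V(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\cdot\frac{q}{a_0}\left\{\frac{1}{4} - \frac{1}{120}\,r^2 P_2(\cos\theta) + \cdots\right\}.$$

(b) For $r \gg 1$, show that

$$V(\mathbf{r}) = \frac{1}{4\pi\varepsilon_0}\cdot\frac{q}{a_0 r}\left\{1 - \frac{6}{r^2}\,P_2(\cos\theta) + \cdots\right\}.$$

10.5.6 Prove that the exponential integral has the expansion

$$\int_x^\infty \frac{e^{-t}}{t}\,dt = -\gamma - \ln x - \sum_{n=1}^\infty\frac{(-1)^n x^n}{n\cdot n!},$$

where $\gamma$ is the Euler-Mascheroni constant.

10.5.7 Show that $E_1(z)$ may be written as

$$E_1(z) = e^{-z}\int_0^\infty \frac{e^{-zt}}{1+t}\,dt.$$

Show also that we must impose the condition $|\arg z| < \pi/2$.

10.5.8 Related to the exponential integral (Eq. 10.72) by a simple change of variable is
the function

$$E_n(x) = \int_1^\infty \frac{e^{-xt}}{t^n}\,dt.$$

Show that $E_n(x)$ satisfies the recurrence relation

$$E_{n+1}(x) = \frac{1}{n}\,e^{-x} - \frac{x}{n}\,E_n(x), \qquad n = 1, 2, 3, \ldots.$$

10.5.9 With $E_n(x)$ defined in Exercise 10.5.8, show that $E_n(0) = 1/(n-1)$, $n > 1$.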

10.5.10 Develop the following power-series expansions

(a) $si(x) = -\dfrac{\pi}{2} + \sum_{n=0}^\infty \dfrac{(-1)^n x^{2n+1}}{(2n+1)(2n+1)!}$,

(b) $Ci(x) = \gamma + \ln x + \sum_{n=1}^\infty \dfrac{(-1)^n x^{2n}}{2n\,(2n)!}$.

10.5.11 An analysis of a center-fed linear antenna leads to the expression

$$\int_0^x \frac{1 - \cos t}{t}\,dt.$$

Show that this is equal to $\gamma + \ln x - Ci(x)$.


FIG. 10.10 Distributed charge potential produced by a 1s hydrogen electron.
Exercise 10.5.14.

10.5.12 Using the relation

$$\Gamma(a) = \gamma(a,x) + \Gamma(a,x),$$

show that if $\gamma(a,x)$ satisfies the relations of Exercise 10.5.2, then $\Gamma(a,x)$ must
satisfy the same relations.

10.5.13 (a) Write a subroutine that will calculate the incomplete gamma functions
$\gamma(n,x)$ and $\Gamma(n,x)$ for n a positive integer. Spot check $\Gamma(n,x)$ by
Gauss-Laguerre quadratures (Appendix 2).

(b) Tabulate $\gamma(n,x)$ and $\Gamma(n,x)$ for x = 0.0(0.1)1.0 and n = 1, 2, and 3.

10.5.14 Calculate the potential produced by a 1s hydrogen electron (Exercise 10.5.4)
(Fig. 10.10). Tabulate $V(r)/(q/4\pi\varepsilon_0 a_0)$ for r = 0.0(0.1)4.0. Check your
calculations for $r \ll 1$ and for $r \gg 1$ by calculating the limiting forms given in
Exercise 10.5.4.

10.5.15 Using Eqs. 5.204 and 10.75, calculate the exponential integral $E_1(x)$ for

(a) x = 0.2(0.2)1.0,

(b) x = 6.0(2.0)10.0.

Program your own calculation but check each value, using a library subroutine
if available. Also check your calculations at each point by a Gauss-Laguerre
quadrature.

You should find that the power series converges rapidly and yields high
precision for small x. The asymptotic series, even for x = 10, yields relatively
poor accuracy.

Check values. $E_1(1.0) = 0.219384$,
$E_1(10.0) = 4.15697 \times 10^{-6}$.

10.5.16 The two expressions for $E_1(x)$, (1) Eq. 5.204, an asymptotic series, and (2) Eq.
10.75, a convergent power series, provide a means of calculating the
Euler-Mascheroni constant $\gamma$ to high accuracy. Using double precision, calculate $\gamma$
from Eq. 10.75 with $E_1(x)$ evaluated by Eq. 5.204.

Hint. As a convenient choice take x in the range 10 to 20. (Your choice of x
will set a limit on the accuracy of your result.) To minimize errors in the
alternating series of Eq. 10.75, accumulate the positive and negative terms separately.

ANS. For x = 10 and "double precision" $\gamma$ = 0.57721566.


BESSEL FUNCTIONS


11.1 BESSEL FUNCTIONS OF THE FIRST KIND, $J_\nu(x)$

Bessel functions appear in a wide variety of physical problems. In Section 2.6 
separation of the Helmholtz or wave equation in circular cylindrical coordinates 
led to Bessel’s equation. In Section 11.7 we will see that the Helmholtz equation 
in spherical polar coordinates also leads to a form of Bessel’s equation. Bessel 
functions may also appear in integral form — integral representations. This may 
result from integral transforms (Chapter 15) or from the mathematical elegance 
of starting the study of Bessel functions with Hankel functions, Section 11.4. 

Bessel functions and closely related functions form a rich area of mathe- 
matical analysis with many representations, many interesting and useful 
properties, and many interrelations. Some of the major interrelations developed 
in Section 11.1 and in succeeding sections are outlined in Fig. 11.1. Note that 
Bessel functions are not restricted to Chapter 11. The asymptotic forms are 
developed in Section 7.4 as well as in Section 11.6. The confluent hypergeometric
representations appear in Section 13.6. 

Generating Function, Integral Order, J n (x) 

Although Bessel functions are of interest primarily as solutions of differential 
equations, it is instructive and convenient to develop them from a completely 
different approach, that of the generating function. 1 This approach also has the 
advantage of focusing on the functions themselves rather than on the differential 
equations they satisfy. An outline of the development of Bessel and related 
functions from the generating function is shown in Fig. 11.1. Let us introduce 
a function of two variables, 

$$g(x,t) = e^{(x/2)(t - 1/t)}. \tag{11.1}$$

Expanding this function in a Laurent series (Section 6.5), we obtain

$$e^{(x/2)(t - 1/t)} = \sum_{n=-\infty}^{\infty} J_n(x)\,t^n. \tag{11.2}$$


1 Generating functions have already been used in Chapter 5. In Section 5.6
the generating function $(1+x)^n$ generated the binomial coefficients. In
Section 5.9 the generating function $x(e^x - 1)^{-1}$ generated the Bernoulli
numbers.

FIG. 11.1 Bessel function interrelations

The coefficient of $t^n$, $J_n(x)$, is defined to be a Bessel function of the first kind of
integral order n. Expanding the exponentials, we have a product of Maclaurin
series in $xt/2$ and $-x/2t$, respectively.


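$$e^{xt/2}\cdot e^{-x/2t} = \sum_{r=0}^\infty \left(\frac{x}{2}\right)^r\frac{t^r}{r!}\;\sum_{s=0}^\infty (-1)^s\left(\frac{x}{2}\right)^s\frac{t^{-s}}{s!}. \tag{11.3}$$

For a given s we get $t^n$ ($n \ge 0$) from r = n + s:

$$\left(\frac{x}{2}\right)^{n+s}\frac{t^{n+s}}{(n+s)!}\,(-1)^s\left(\frac{x}{2}\right)^s\frac{t^{-s}}{s!}. \tag{11.4}$$

The coefficient of $t^n$ is then²

$$J_n(x) = \sum_{s=0}^\infty \frac{(-1)^s}{s!\,(n+s)!}\left(\frac{x}{2}\right)^{n+2s} = \frac{x^n}{2^n n!} - \frac{x^{n+2}}{2^{n+2}(n+1)!} + \cdots. \tag{11.5}$$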


This series form exhibits the behavior of the Bessel function $J_n(x)$ for small x
and permits numerical evaluation of $J_n(x)$. The results for $J_0$, $J_1$, and $J_2$ are
shown in Fig. 11.2. From Section 5.3 the error in using only a finite number of
terms in numerical evaluation is less than the first term omitted. For instance,
if we want $J_n(x)$ to ±1% accuracy, the first term alone of Eq. 11.5 will suffice,
provided the ratio of the second term to the first is less than 1% (in magnitude)
or $x < 0.2\,(n+1)^{1/2}$. The Bessel functions oscillate but are not periodic, except
in the limit as $x \to \infty$ (Section 11.6). The amplitude of $J_n(x)$ is not constant but
decreases asymptotically as $x^{-1/2}$.
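As an illustration of the numerical use of Eq. 11.5, the Python sketch below sums the series term by term, generating each term from the previous one. The name `bessel_j_series` and the absolute stopping tolerance are our own choices; the sketch assumes a nonnegative integer order and moderate x, because for large x the alternating terms grow before they decay and rounding error accumulates.

```python
import math


def bessel_j_series(n: int, x: float, tol: float = 1e-17,
                    max_terms: int = 200) -> float:
    """J_n(x) for integer n >= 0 from the power series, Eq. 11.5."""
    half = 0.5 * x
    term = half**n / math.factorial(n)  # the s = 0 term
    total = term
    for s in range(1, max_terms):
        # ratio of successive terms: -(x/2)^2 / (s (n + s))
        term *= -half * half / (s * (n + s))
        total += term
        if abs(term) < tol:
            break
    return total
```

Evaluating the function at the tabulated zeros (for example 2.4048 for J_0) should give values close to zero, limited by the four-decimal rounding of the table entries.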

Equation 11.5 actually holds for n < 0, also giving

$$J_{-n}(x) = \sum_{s=0}^\infty \frac{(-1)^s}{s!\,(s-n)!}\left(\frac{x}{2}\right)^{2s-n}, \tag{11.6}$$

which amounts to replacing n by −n in Eq. 11.5. Since n is an integer (here),
$(s-n)! \to \infty$ for s = 0, ..., (n − 1). Hence the series may be considered to start
with s = n. Replacing s by s + n, we obtain

$$J_{-n}(x) = \sum_{s=0}^\infty \frac{(-1)^{s+n}}{s!\,(n+s)!}\left(\frac{x}{2}\right)^{n+2s}, \tag{11.7}$$

showing immediately that $J_n(x)$ and $J_{-n}(x)$ are not independent but are related
by

$$J_{-n}(x) = (-1)^n J_n(x), \qquad \text{(integral } n\text{)}. \tag{11.8}$$

These series expressions (Eqs. 11.5 and 11.6) may be used with n replaced by $\nu$
to define $J_\nu(x)$ and $J_{-\nu}(x)$ for nonintegral $\nu$ (compare Exercise 11.1.7).

2 From the steps leading to this series and from its convergence characteristics
it should be clear that this series may be used with x replaced by z and with z
any point in the finite complex plane.


Recurrence Relations 

The recurrence relations for $J_n(x)$ and its derivatives may all be obtained by
operating on the series, Eq. 11.5, although this requires a bit of clairvoyance
(or a lot of trial and error). Verification of the known recurrence relations is
straightforward, Exercise 11.1.7. Here it is convenient to obtain them from the
generating function, g(x, t). Differentiating Eq. 11.1 partially with respect to t,
we find that

$$\frac{\partial}{\partial t}\,g(x,t) = \frac{1}{2}\,x\left(1 + \frac{1}{t^2}\right)e^{(x/2)(t-1/t)} = \sum_{n=-\infty}^\infty n J_n(x)\,t^{n-1}, \tag{11.9}$$

and substituting Eq. 11.2 for the exponential and equating the coefficients of
like powers of t,³ we obtain

$$J_{n-1}(x) + J_{n+1}(x) = \frac{2n}{x}\,J_n(x). \tag{11.10}$$

This is a three-term recurrence relation. Given $J_0$ and $J_1$, for example, $J_2$ (and
any other integral order $J_n$) may be computed.

With the opportunities offered by modern digital computers (and the
demands they levy), Eq. 11.10 has acquired an interesting new application.
In computing a numerical value of $J_N(x_0)$ for a given $x_0$, one could use Eq. 11.5
for small x, or the asymptotic form, Eq. 11.144 of Section 11.6, for large x. A
better way, in terms of accuracy and machine utilization, is to use the recurrence
relation, Eq. 11.10, and work down.⁴ With $n \gg N$ and $n \gg x_0$, assume

$$J_{n+1}(x_0) = 0 \quad\text{and}\quad J_n(x_0) = \alpha,$$

where $\alpha$ is some small number. Then Eq. 11.10 leads to $J_{n-1}(x_0)$, $J_{n-2}(x_0)$,
and so on, and finally, to $J_0(x_0)$. Since $\alpha$ is arbitrary, the $J_n$'s are all off by a
common factor. This factor is determined by the condition

$$J_0(x_0) + 2\sum_{m=1}^\infty J_{2m}(x_0) = 1. \tag{11.10a}$$

(Set t = 1 in Eq. 11.2.) The accuracy of this calculation is checked by trying again
at n' = n + 3. This technique yields the desired $J_N(x_0)$ and all the lower integral
index J's down to $J_0$. This is the technique employed by the FORTRAN SSP
subroutine BESJ.

3 This depends on the fact that the power-series representation is unique
(Sections 5.7, 6.5).

4 I. A. Stegun and M. Abramowitz, "Generation of Bessel functions on high
speed computers," Mathematical Tables and Other Aids for Computation, 11,
255-257 (1957).

High-speed, high-precision numerical computation is more or less an art. 
Modifications and refinements of this and other numerical techniques are being 
proposed year by year. For information on the current “state of the art” the 
student will have to go to the literature, and this means primarily to the journal 
Mathematics of Computation. 
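The downward-recurrence scheme described above can be sketched in a few lines of Python. This is an illustrative arrangement, not the SSP routine BESJ: the name `bessel_j_downward`, the starting-order heuristic (`extra` orders above the larger of N and |x_0|), the seed value 1e-30, and the rescaling guard against overflow are all our own assumptions.

```python
def bessel_j_downward(n_max: int, x: float, extra: int = 20) -> list:
    """Return [J_0(x), ..., J_{n_max}(x)] by downward use of Eq. 11.10,
    normalized with Eq. 11.10a."""
    if x == 0.0:
        return [1.0] + [0.0] * n_max
    start = max(n_max, int(abs(x))) + extra  # assumed heuristic for n >> N, x_0
    values = [0.0] * (start + 2)             # values[start + 1] = 0 by assumption
    values[start] = 1e-30                    # arbitrary small alpha
    for k in range(start, 0, -1):
        # Eq. 11.10 solved for the lower order: J_{k-1} = (2k/x) J_k - J_{k+1}
        values[k - 1] = (2.0 * k / x) * values[k] - values[k + 1]
        if abs(values[k - 1]) > 1e100:       # rescale to avoid overflow
            for i in range(k - 1, start + 1):
                values[i] *= 1e-100
    # Eq. 11.10a fixes the common factor: J_0 + 2 (J_2 + J_4 + ...) = 1
    norm = values[0] + 2.0 * sum(values[2::2])
    return [v / norm for v in values[: n_max + 1]]
```

Repeating the call with a larger `extra` and comparing the results mirrors the accuracy check at n' = n + 3 mentioned in the text.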

Differentiating Eq. 11.1 partially with respect to x, we have

$$\frac{\partial}{\partial x}\,g(x,t) = \frac{1}{2}\left(t - \frac{1}{t}\right)e^{(x/2)(t-1/t)} = \sum_{n=-\infty}^\infty J_n'(x)\,t^n. \tag{11.11}$$

Again, substituting in Eq. 11.2 and equating the coefficients of like powers of t,
we obtain the result

$$J_{n-1}(x) - J_{n+1}(x) = 2J_n'(x). \tag{11.12}$$

As a special case of this general recurrence relation,

$$J_0'(x) = -J_1(x). \tag{11.13}$$

Adding Eqs. 11.10 and 11.12 and dividing by 2, we have

$$J_{n-1}(x) = \frac{n}{x}\,J_n(x) + J_n'(x). \tag{11.14}$$

Multiplying by $x^n$ and rearranging terms produces

$$\frac{d}{dx}\left[x^n J_n(x)\right] = x^n J_{n-1}(x). \tag{11.15}$$

Subtracting Eq. 11.12 from 11.10 and dividing by 2 yields

$$J_{n+1}(x) = \frac{n}{x}\,J_n(x) - J_n'(x). \tag{11.16}$$

Multiplying by $x^{-n}$ and rearranging terms, we obtain

$$\frac{d}{dx}\left[x^{-n} J_n(x)\right] = -x^{-n} J_{n+1}(x). \tag{11.17}$$
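A quick numerical spot check of the two basic recurrence relations, Eqs. 11.10 and 11.12, can be reassuring before they are used further. The Python sketch below is one way to do it; the helper names are hypothetical, the series evaluation follows Eq. 11.5, and the derivative is approximated by a central difference with an assumed step h, so the second residual is limited to roughly eight significant figures.

```python
import math


def bessel_j_series(n: int, x: float, max_terms: int = 200) -> float:
    """J_n(x), integer n >= 0, from the power series of Eq. 11.5."""
    half = 0.5 * x
    term = half**n / math.factorial(n)
    total = term
    for s in range(1, max_terms):
        term *= -half * half / (s * (n + s))
        total += term
        if abs(term) < 1e-17:
            break
    return total


def recurrence_residuals(n: int, x: float, h: float = 1e-5) -> tuple:
    """Residuals of Eq. 11.10 and Eq. 11.12 at order n >= 1 and argument x != 0."""
    j_lower = bessel_j_series(n - 1, x)
    j_mid = bessel_j_series(n, x)
    j_upper = bessel_j_series(n + 1, x)
    # central-difference estimate of J_n'(x)
    deriv = (bessel_j_series(n, x + h) - bessel_j_series(n, x - h)) / (2.0 * h)
    res_11_10 = j_lower + j_upper - (2.0 * n / x) * j_mid
    res_11_12 = j_lower - j_upper - 2.0 * deriv
    return res_11_10, res_11_12
```

Both residuals should be close to zero; the first at the level of rounding error, the second at the level of the finite-difference error.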


Bessel's Differential Equation 

Suppose we consider a set of functions $Z_\nu(x)$ which satisfies the basic recurrence
relations (Eqs. 11.10 and 11.12), but with $\nu$ not necessarily an integer and
$Z_\nu$ not necessarily given by the series (Eq. 11.5). Equation 11.14 may be rewritten
($n \to \nu$) as

$$xZ_\nu'(x) = xZ_{\nu-1}(x) - \nu Z_\nu(x). \tag{11.18}$$

On differentiating with respect to x, we have

$$xZ_\nu''(x) + (\nu + 1)Z_\nu' - xZ_{\nu-1}' - Z_{\nu-1} = 0. \tag{11.19}$$

Multiplying by x and then subtracting Eq. 11.18 multiplied by $\nu$ gives us

$$x^2 Z_\nu'' + xZ_\nu' - \nu^2 Z_\nu + (\nu - 1)xZ_{\nu-1} - x^2 Z_{\nu-1}' = 0. \tag{11.20}$$

Now we rewrite Eq. 11.16 and replace n by $\nu - 1$.

$$xZ_{\nu-1}' = (\nu - 1)Z_{\nu-1} - xZ_\nu. \tag{11.21}$$

Using this to eliminate $Z_{\nu-1}$ and $Z_{\nu-1}'$ from Eq. 11.20, we finally get

$$x^2 Z_\nu'' + xZ_\nu' + (x^2 - \nu^2)Z_\nu = 0. \tag{11.22}$$

This is just Bessel's equation. Hence any functions, $Z_\nu(x)$, that satisfy the recurrence
relations (Eqs. 11.10 and 11.12, 11.14 and 11.16, or 11.15 and 11.17) satisfy
Bessel's equation; that is, the unknown $Z_\nu$ are Bessel functions. In particular,
we have shown that the functions $J_n(x)$, defined by our generating function,
satisfy Bessel's equation. If the argument is $k\rho$ rather than x, Eq. 11.22 becomes

$$\rho^2\frac{d^2}{d\rho^2}Z_\nu(k\rho) + \rho\frac{d}{d\rho}Z_\nu(k\rho) + (k^2\rho^2 - \nu^2)Z_\nu(k\rho) = 0. \tag{11.22a}$$


Integral Representation 

A particularly useful and powerful way of treating Bessel functions employs
integral representations. If we return to the generating function (Eq. 11.2),
and substitute $t = e^{i\theta}$, we get

$$e^{ix\sin\theta} = J_0(x) + 2\left(J_2(x)\cos 2\theta + J_4(x)\cos 4\theta + \cdots\right) + 2i\left(J_1(x)\sin\theta + J_3(x)\sin 3\theta + \cdots\right), \tag{11.23}$$

in which we have used the relations

$$J_1(x)e^{i\theta} + J_{-1}(x)e^{-i\theta} = J_1(x)\left(e^{i\theta} - e^{-i\theta}\right) = 2iJ_1(x)\sin\theta, \tag{11.24}$$

$$J_2(x)e^{2i\theta} + J_{-2}(x)e^{-2i\theta} = 2J_2(x)\cos 2\theta,$$

and so on.

In summation notation

$$\cos(x\sin\theta) = J_0(x) + 2\sum_{n=1}^\infty J_{2n}(x)\cos(2n\theta),$$

$$\sin(x\sin\theta) = 2\sum_{n=1}^\infty J_{2n-1}(x)\sin[(2n-1)\theta], \tag{11.25}$$

equating real and imaginary parts, respectively. It might be noted that angle $\theta$
(in radians) has no dimensions. Likewise $\sin\theta$ has no dimensions and the
function $\cos(x\sin\theta)$ is perfectly proper from a dimensional point of view.

By employing the orthogonality properties of cosine and sine,⁵

$$\int_0^\pi \cos n\theta\,\cos m\theta\,d\theta = \frac{\pi}{2}\,\delta_{nm}, \tag{11.26a}$$

$$\int_0^\pi \sin n\theta\,\sin m\theta\,d\theta = \frac{\pi}{2}\,\delta_{nm}, \tag{11.26b}$$

in which n and m are positive integers (zero is excluded),⁶ we obtain

$$\frac{1}{\pi}\int_0^\pi \cos(x\sin\theta)\cos n\theta\,d\theta = \begin{cases} J_n(x), & n \text{ even}, \\ 0, & n \text{ odd}, \end{cases} \tag{11.27}$$

$$\frac{1}{\pi}\int_0^\pi \sin(x\sin\theta)\sin n\theta\,d\theta = \begin{cases} 0, & n \text{ even}, \\ J_n(x), & n \text{ odd}. \end{cases} \tag{11.28}$$

If these two equations are added together,

$$J_n(x) = \frac{1}{\pi}\int_0^\pi \left[\cos(x\sin\theta)\cos n\theta + \sin(x\sin\theta)\sin n\theta\right]d\theta = \frac{1}{\pi}\int_0^\pi \cos(n\theta - x\sin\theta)\,d\theta, \qquad n = 0, 1, 2, 3, \ldots. \tag{11.29}$$

As a special case,

$$J_0(x) = \frac{1}{\pi}\int_0^\pi \cos(x\sin\theta)\,d\theta. \tag{11.30}$$

Noting that $\cos(x\sin\theta)$ repeats itself in all four quadrants ($\theta_1 = \theta$, $\theta_2 = \pi - \theta$,
$\theta_3 = \pi + \theta$, $\theta_4 = -\theta$), we may write Eq. 11.30 as

$$J_0(x) = \frac{1}{2\pi}\int_0^{2\pi} \cos(x\sin\theta)\,d\theta. \tag{11.30a}$$

On the other hand, $\sin(x\sin\theta)$ reverses its sign in the third and fourth quadrants
so that

$$\frac{1}{2\pi}\int_0^{2\pi} \sin(x\sin\theta)\,d\theta = 0. \tag{11.30b}$$

Adding Eq. 11.30a and i times Eq. 11.30b, we obtain the complex exponential
representation

$$J_0(x) = \frac{1}{2\pi}\int_0^{2\pi} e^{ix\sin\theta}\,d\theta = \frac{1}{2\pi}\int_0^{2\pi} e^{ix\cos\theta}\,d\theta. \tag{11.30c}$$

5 They are eigenfunctions of a self-adjoint equation (linear oscillator equation)
and satisfy appropriate boundary conditions (compare Sections 9.2 and
14.1).

6 Equations 11.26a and b hold for either m or n = 0. If both m and n = 0, the
constant in 11.26a becomes $\pi$; the constant in Eq. 11.26b becomes 0.

This integral representation (Eq. 11.29) may be obtained somewhat more
directly by employing contour integration (compare Exercise 11.1.16).⁷ Many
other integral representations exist (compare Exercise 11.1.18).
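The integral representation of Eq. 11.29 is also convenient for numerical work. The Python sketch below applies the composite trapezoidal rule on [0, π]; the function name `bessel_j_integral` and the default of 200 panels are our own assumptions. Because the integrand is an even, 2π-periodic function of θ, the trapezoidal rule converges very quickly here, so a modest number of panels is usually sufficient for moderate n and x.

```python
import math


def bessel_j_integral(n: int, x: float, panels: int = 200) -> float:
    """J_n(x) from Eq. 11.29 by the composite trapezoidal rule on [0, pi]."""
    h = math.pi / panels
    # endpoint values: cos(0) = 1 at theta = 0 and cos(n*pi) at theta = pi
    total = 0.5 * (1.0 + math.cos(n * math.pi))
    for k in range(1, panels):
        theta = k * h
        total += math.cos(n * theta - x * math.sin(theta))
    return total * h / math.pi
```

The tabulated zeros of Table 11.1 again provide a simple check, for instance the first zero of J_1 at 3.8317.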



EXAMPLE 11.1.1 Fraunhofer Diffraction, Circular Aperture 

In the theory of diffraction through a circular aperture we encounter the
integral

$$\Phi \sim \int_0^a\int_0^{2\pi} e^{ibr\cos\theta}\,d\theta\,r\,dr \tag{11.31}$$

for $\Phi$, the amplitude of the diffracted wave.⁸ Here $\theta$ is an azimuth angle in the
plane of the circular aperture of radius a, and $\alpha$ is the angle defined by a point
on a screen below the circular aperture relative to the normal through the
center point. The parameter b is given by

$$b = \frac{2\pi}{\lambda}\sin\alpha, \tag{11.32}$$

with $\lambda$ the wavelength of the incident wave. The other symbols are defined by
Fig. 11.3. From Eq. 11.30c we get⁹

$$\Phi \sim 2\pi\int_0^a J_0(br)\,r\,dr. \tag{11.33}$$

Equation 11.15 enables us to integrate Eq. 11.33 immediately to obtain

$$\Phi \sim \frac{2\pi ab}{b^2}\,J_1(ab) \sim \frac{\lambda a}{\sin\alpha}\,J_1\!\left(\frac{2\pi a}{\lambda}\sin\alpha\right). \tag{11.34}$$

The intensity of the light in the diffraction pattern is proportional to $\Phi^2$ and

$$\Phi^2 \sim \left\{\frac{J_1[(2\pi a/\lambda)\sin\alpha]}{\sin\alpha}\right\}^2. \tag{11.35}$$


TABLE 11.1 Zeros of the Bessel Functions and Their First Derivatives

Number
of zero    J_0(x)     J_1(x)     J_2(x)     J_3(x)     J_4(x)     J_5(x)
   1       2.4048     3.8317     5.1356     6.3802     7.5883     8.7715
   2       5.5201     7.0156     8.4172     9.7610    11.0647    12.3386
   3       8.6537    10.1735    11.6198    13.0152    14.3725    15.7002
   4      11.7915    13.3237    14.7960    16.2235    17.6160    18.9801
   5      14.9309    16.4706    17.9598    19.4094    20.8269    22.2178

           J_0'(x)    J_1'(x)    J_2'(x)    J_3'(x)
   1       3.8317     1.8412     3.0542     4.2012
   2       7.0156     5.3314     6.7061     8.0152
   3      10.1735     8.5363     9.9695    11.3459

Note. $J_0'(x) = -J_1(x)$.


7 For n = 0 a simple integration over $\theta$ from 0 to $2\pi$ will convert Eq. 11.23
into Eq. 11.30c.

8 The exponent $ibr\cos\theta$ gives the phase of the wave on the distant screen at
angle $\alpha$ relative to the phase of the wave incident on the aperture at the point
$(r, \theta)$. The imaginary exponential form of this integrand means that the
integral is technically a Fourier transform, Chapter 15. In general, the
Fraunhofer diffraction pattern is given by the Fourier transform of the
aperture.

9 We could also refer to Exercise 11.1.16(b).

10 Additional roots of the Bessel functions and their first derivatives may be
found in C. L. Beattie, "Table of First 700 Zeros of Bessel Functions," Bell
Tech. J. 37, 689 (1958) and Bell Monograph 3055.


From Table 11.1, which lists the zeros of the Bessel functions and their first
derivatives,¹⁰ the expression 11.35 will have a zero at

$$\frac{2\pi a}{\lambda}\sin\alpha = 3.8317\ldots \tag{11.36}$$

or

$$\sin\alpha = \frac{3.8317\,\lambda}{2\pi a}. \tag{11.37}$$

For green light $\lambda = 5.5 \times 10^{-5}$ cm. Hence, if a = 0.5 cm,

$$\alpha \approx \sin\alpha = 6.7 \times 10^{-5}\ \text{(radian)} \approx 14\ \text{seconds of arc}, \tag{11.38}$$

which shows that the bending or spreading of the light ray is extremely small.
If this analysis had been known in the seventeenth century, the arguments
against the wave theory of light would have collapsed.

In the mid-twentieth century this same diffraction pattern appears in the scattering
of nuclear particles by atomic nuclei, a striking demonstration of the wave
properties of the nuclear particles.
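The arithmetic behind Eqs. 11.37 and 11.38 can be reproduced with a few lines of Python. This is only a sketch under the stated values (green light of wavelength 5.5 × 10⁻⁵ cm and aperture radius 0.5 cm); the function names are hypothetical, and the first zero of J_1 is copied from Table 11.1.

```python
import math

FIRST_ZERO_J1 = 3.8317  # first zero of J_1, from Table 11.1


def first_minimum_angle(wavelength: float, radius: float) -> float:
    """Angle alpha (radians) of the first zero of Eq. 11.35, from Eq. 11.37.

    wavelength and radius must be given in the same length unit.
    """
    return math.asin(FIRST_ZERO_J1 * wavelength / (2.0 * math.pi * radius))


def radians_to_arcseconds(angle: float) -> float:
    """Convert an angle in radians to seconds of arc."""
    return math.degrees(angle) * 3600.0


alpha = first_minimum_angle(5.5e-5, 0.5)  # both lengths in centimeters
```

With these inputs `alpha` should come out near 6.7 × 10⁻⁵ radian, or roughly 14 seconds of arc, in line with Eq. 11.38.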

A further example of the use of Bessel functions and their roots is provided by 
the electromagnetic resonant cavity, Example 11.1.2 that follows and the 
example and exercises of Section 11.2. 


EXAMPLE 11.1.2 Cylindrical Resonant Cavity 

In the interior of a resonant cavity electromagnetic waves oscillate with a
time dependence $e^{-i\omega t}$. Maxwell's equations lead to

$$\nabla \times \nabla \times \mathbf{E} = \alpha^2\mathbf{E}$$

for the space part of the electric field with $\alpha^2 = \omega^2\varepsilon_0\mu_0$ (Example 1.9.2). With
$\nabla\cdot\mathbf{E} = 0$ (vacuum, no charges),

$$\nabla^2\mathbf{E} + \alpha^2\mathbf{E} = 0.$$

Separating variables in circular cylindrical coordinates (Section 2.4), we find
that the z-component ($E_z$, space part only) satisfies the scalar Helmholtz
equation

$$\nabla^2 E_z + \alpha^2 E_z = 0, \tag{11.39}$$

where $\alpha^2 = \omega^2\varepsilon_0\mu_0 = \omega^2/c^2$. Further,

$$(E_z)_{mnk} = \sum_{m,n} J_m(\gamma_{mn}\rho)\,e^{\pm im\varphi}\left[a_{mn}\sin kz + b_{mn}\cos kz\right]. \tag{11.40}$$

The parameter k is a separation constant introduced in splitting off the z
dependence of $E_z(\rho, \varphi, z)$. Similarly, m entered in splitting off the $\varphi$ dependence.
$\gamma^2$ enters as $\alpha^2 - k^2$, and $\gamma$ is quantized by the requirement that $\gamma a$ be a root of the
Bessel function $J_m$ (Eq. 11.43 which follows). Then the n in $\gamma_{mn}$ designates the nth
root of $J_m$.

For the end surfaces at z = 0 and z = l (as in Fig. 11.4), let us set $a_{mn} = 0$, and

$$k = \frac{p\pi}{l}, \qquad p = 0, 1, 2, \ldots. \tag{11.41}$$

Maxwell's equations then guarantee that the tangential electric fields $E_\rho$ and
$E_\varphi$ will vanish at z = 0 and l. This is the transverse magnetic or TM mode of
oscillation. We have

$$\gamma^2 = \frac{\omega^2}{c^2} - k^2 = \frac{\omega^2}{c^2} - \frac{p^2\pi^2}{l^2}. \tag{11.42}$$

But there is the usual boundary condition that $E_z(\rho = a) = 0$. Hence we must
set

$$\gamma_{mn} = \frac{\alpha_{mn}}{a}, \tag{11.43}$$

where $\alpha_{mn}$ is the nth zero of $J_m$.

The result of the two boundary conditions and the separation constant $m^2$
is that the angular frequency of our oscillation depends on three discrete
parameters

$$\omega_{mnp} = c\,\sqrt{\frac{\alpha_{mn}^2}{a^2} + \frac{p^2\pi^2}{l^2}}, \qquad \begin{cases} m = 0, 1, 2, \ldots \\ n = 1, 2, 3, \ldots \\ p = 0, 1, 2, \ldots \end{cases} \tag{11.44}$$

These are the allowable resonant frequencies for our TM mode. The TE mode
of oscillation is the topic of Exercise 11.1.26.
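To give a feeling for the numbers in Eq. 11.44, the Python sketch below evaluates the TM resonant angular frequency for a vacuum-filled cavity. Everything beyond Eq. 11.44 and Table 11.1 is assumed for illustration: the function name, the use of SI units, and the sample cavity (radius 5 cm, length 10 cm) are hypothetical.

```python
import math

SPEED_OF_LIGHT = 2.99792458e8  # m/s, vacuum

# Zeros alpha_mn of J_m copied from Table 11.1, keyed by (m, n).
BESSEL_ZEROS = {
    (0, 1): 2.4048, (0, 2): 5.5201, (0, 3): 8.6537,
    (1, 1): 3.8317, (1, 2): 7.0156, (1, 3): 10.1735,
    (2, 1): 5.1356, (2, 2): 8.4172, (2, 3): 11.6198,
}


def tm_angular_frequency(m: int, n: int, p: int,
                         radius: float, length: float) -> float:
    """omega_mnp of Eq. 11.44 in rad/s; radius and length in meters."""
    alpha_mn = BESSEL_ZEROS[(m, n)]
    return SPEED_OF_LIGHT * math.sqrt(
        (alpha_mn / radius) ** 2 + (p * math.pi / length) ** 2
    )


# Lowest TM mode (m = 0, n = 1, p = 0) of the hypothetical sample cavity.
omega_010 = tm_angular_frequency(0, 1, 0, radius=0.05, length=0.10)
frequency_hz = omega_010 / (2.0 * math.pi)
```

For p = 0 the result reduces to c α_mn / a, independent of the cavity length, which is a handy sanity check.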


Alternate Approaches 

Bessel functions are introduced here by means of a generating function,
Eq. 11.2. Other approaches are possible. Listing the various possibilities,
we have

1. Generating function (magic), Eq. 11.2.

2. Series solution of Bessel's differential equation, Section 8.5.

3. Contour integrals: Some writers prefer to start with contour integral
definitions of the Hankel functions, Sections 7.4 and 11.4, and develop the
Bessel function $J_\nu(x)$ from the Hankel functions.

4. Direct solution of physical problems: Example 11.1.1, Fraunhofer
diffraction with a circular aperture, illustrates this. Incidentally, Eq. 11.31
can be treated by series expansion, if desired. Feynman¹¹ develops Bessel
functions from a consideration of cavity resonators.

In case the generating function seems too arbitrary, it can be derived from 
a contour integral, Exercise 11.1.16, or from the Bessel function recurrence 
relations, Exercise 11.1.6. 

Bessel Functions of Nonintegral Order 

These different approaches are not exactly equivalent. The generating
function approach is very convenient for deriving two recurrence relations,
Bessel's differential equation, integral representations, addition theorems
(Exercise 11.1.2), and upper and lower bounds (Exercise 11.1.1). However, the
reader will probably have noticed that the generating function defined only
Bessel functions of integral order, $J_0$, $J_1$, $J_2$, and so on. This is a great limitation
of the generating function approach. But the Bessel function of the first kind,
$J_\nu(x)$, may easily be defined for nonintegral $\nu$ by using the series (Eq. 11.5)
as a new definition.

The recurrence relations may be verified by substituting in the series form
of $J_\nu(x)$ (Exercise 11.1.7). From these relations Bessel's equation follows. In fact,
if $\nu$ is not an integer, there is actually an important simplification. It is found that
$J_\nu$ and $J_{-\nu}$ are independent, for no relation of the form of Eq. 11.8 exists. On the
other hand, for $\nu = n$, an integer, we need another solution. The development
of this second solution and an investigation of its properties form the subject
of Section 11.3.


EXERCISES 

11.1.1 From the product of the generating functions $g(x,t)\cdot g(x,-t)$ show that

$$1 = [J_0(x)]^2 + 2[J_1(x)]^2 + 2[J_2(x)]^2 + \cdots$$

and therefore that $|J_0(x)| \le 1$ and $|J_n(x)| \le 1/\sqrt{2}$, n = 1, 2, 3, ....
Hint. Use uniqueness of power series, Section 5.7.

11 R. P. Feynman, R. B. Leighton, and M. Sands, The Feynman Lectures on
Physics, Vol. II, Chap. 23. Reading, Mass.: Addison-Wesley (1964).

11.1.2 Using a generating function $g(x,t) = g(u+v,t) = g(u,t)\cdot g(v,t)$, show that

(a) $J_n(u+v) = \sum_{s=-\infty}^{\infty} J_s(u)\,J_{n-s}(v)$,

(b) $J_0(u+v) = J_0(u)J_0(v) + 2\sum_{s=1}^{\infty} J_s(u)\,J_{-s}(v)$.

These are addition theorems for the Bessel functions.


11.1.3 Using only the generating function

$$e^{(x/2)(t-1/t)} = \sum_{n=-\infty}^{\infty} J_n(x)\,t^n$$

and not the explicit series form of $J_n(x)$, show that $J_n(x)$ has odd or even parity
according to whether n is odd or even, that is,¹²

$$J_n(x) = (-1)^n J_n(-x).$$

12 This is easily seen from the series form (Eq. 11.5).

11.1.4 Derive the Jacobi-Anger expansion

$$e^{iz\cos\theta} = \sum_{m=-\infty}^{\infty} i^m J_m(z)\,e^{im\theta}.$$

This is an expansion of a plane wave in a series of cylindrical waves.

11.1.5 Show that

(a) $\cos x = J_0(x) + 2\sum_{n=1}^{\infty} (-1)^n J_{2n}(x)$,

(b) $\sin x = 2\sum_{n=1}^{\infty} (-1)^{n+1} J_{2n-1}(x)$.

11.1.6 To help remove the generating function from the realm of magic, show that it
can be derived from the recurrence relation, Eq. 11.10.

Hint. 1. Assume a generating function of the form

$$g(x,t) = \sum_{m=-\infty}^{\infty} J_m(x)\,t^m.$$

2. Multiply Eq. 11.10 by $t^n$ and sum over n.

3. Rewrite the preceding result as

$$\left(t + \frac{1}{t}\right)g(x,t) = \frac{2t}{x}\,\frac{\partial g(x,t)}{\partial t}.$$

4. Integrate and adjust the "constant" of integration (a function of x) so
that the coefficient of $t^0$ is $J_0(x)$ as given by Eq. 11.5.

11.1.7 Show, by direct differentiation, that

$$J_\nu(x) = \sum_{s=0}^{\infty} \frac{(-1)^s}{s!\,(s+\nu)!}\left(\frac{x}{2}\right)^{\nu+2s}$$

satisfies the two recurrence relations

$$J_{\nu-1}(x) + J_{\nu+1}(x) = \frac{2\nu}{x}\,J_\nu(x),$$

$$J_{\nu-1}(x) - J_{\nu+1}(x) = 2J_\nu'(x),$$

and Bessel's differential equation

$$x^2 J_\nu''(x) + xJ_\nu'(x) + (x^2 - \nu^2)J_\nu(x) = 0.$$


11.1.8 Prove that

(a) $\dfrac{\sin x}{x} = \int_0^{\pi/2} J_0(x\cos\theta)\cos\theta\,d\theta$,

(b) $\dfrac{1 - \cos x}{x} = \int_0^{\pi/2} J_1(x\cos\theta)\,d\theta$.

Hint. The definite integral

$$\int_0^{\pi/2} \cos^{2s+1}\theta\,d\theta = \frac{2\cdot 4\cdot 6\cdots(2s)}{1\cdot 3\cdot 5\cdots(2s+1)}$$

may be useful.

11.1.9 Show that

$$J_0(x) = \frac{2}{\pi}\int_0^1 \frac{\cos xt}{\sqrt{1 - t^2}}\,dt.$$

This integral is a Fourier cosine transform (compare Section 15.3). The
corresponding Fourier sine transform,

$$J_0(x) = \frac{2}{\pi}\int_1^\infty \frac{\sin xt}{\sqrt{t^2 - 1}}\,dt,$$

is established in Section 11.4, using a Hankel function integral representation.

11.1.10 Derive

$$J_n(x) = (-1)^n x^n\left(\frac{d}{x\,dx}\right)^n J_0(x).$$

Hint. Try mathematical induction.

11.1.11 Show that between any two consecutive zeros of $J_n(x)$ there is one and only one
zero of $J_{n+1}(x)$.

Hint. Equations 11.15 and 11.17 may be useful.

11.1.12 An analysis of antenna radiation patterns for a system with a circular aperture
involves the equation

$$g(u) = \int_0^1 f(r)\,J_0(ur)\,r\,dr.$$

If $f(r) = 1 - r^2$, show that

$$g(u) = \frac{2}{u^2}\,J_2(u).$$


11.1.13 The differential cross section in a nuclear scattering experiment is given by
$d\sigma/d\Omega = |f(\theta)|^2$. An approximate treatment leads to

$$f(\theta) = \frac{-ik}{2\pi}\int_0^{2\pi}\int_0^R \exp[ik\rho\sin\theta\sin\varphi]\,\rho\,d\rho\,d\varphi.$$

Here $\theta$ is an angle through which the scattered particle is scattered. R is the
nuclear radius. Show that

$$\frac{d\sigma}{d\Omega} = (\pi R^2)\,\frac{1}{\pi}\left[\frac{J_1(kR\sin\theta)}{\sin\theta}\right]^2.$$

11.1.14 A set of functions $C_n(x)$ satisfies the recurrence relations

$$C_{n-1}(x) - C_{n+1}(x) = \frac{2n}{x}\,C_n(x),$$

$$C_{n-1}(x) + C_{n+1}(x) = 2C_n'(x).$$

(a) What linear second-order differential equation does the $C_n(x)$ satisfy?

(b) By a change of variable transform your differential equation into Bessel's
equation. This suggests that $C_n(x)$ may be expressed in terms of Bessel
functions of transformed argument.

11.1.15 A particle (mass m) is contained in a right circular cylinder (pillbox) of radius R
and height H. The particle is described by a wave function satisfying the
Schrödinger wave equation

$$-\frac{\hbar^2}{2m}\nabla^2\psi(\rho,\varphi,z) = E\,\psi(\rho,\varphi,z)$$

and the condition that the wave function go to zero over the surface of the
pillbox. Find the lowest (zero point) permitted energy.

ANS. $E = \dfrac{\hbar^2}{2m}\left[\left(\dfrac{z_{pq}}{R}\right)^2 + \left(\dfrac{n\pi}{H}\right)^2\right]$,

where $z_{pq}$ is the qth zero of $J_p$, the index p fixed by the azimuthal dependence.

$$E_{\min} = \frac{\hbar^2}{2m}\left[\left(\frac{2.405}{R}\right)^2 + \left(\frac{\pi}{H}\right)^2\right].$$

11.1.16 (a) Show by direct differentiation and substitution that

$$J_\nu(x) = \frac{1}{2\pi i}\int_C e^{(x/2)(t-1/t)}\,t^{-\nu-1}\,dt$$

or that the equivalent equation

$$J_\nu(x) = \frac{1}{2\pi i}\left(\frac{x}{2}\right)^\nu\int_C e^{s - x^2/4s}\,s^{-\nu-1}\,ds$$

satisfies Bessel's equation. C is the contour shown in Fig. 11.5. The negative real
axis is the cut line.

Hint. Show that the total integrand (after substituting in Bessel's differential
equation) may be written as a total derivative:

$$\frac{d}{dt}\left\{\exp\left[\frac{x}{2}\left(t - \frac{1}{t}\right)\right]t^{-\nu}\left[\nu + \frac{x}{2}\left(t + \frac{1}{t}\right)\right]\right\}.$$

(b) Show that the first integral (with n an integer) may be transformed into

$$J_n(x) = \frac{1}{2\pi}\int_0^{2\pi} e^{i(x\sin\theta - n\theta)}\,d\theta = \frac{i^{-n}}{2\pi}\int_0^{2\pi} e^{i(x\cos\theta + n\theta)}\,d\theta.$$


FIG. 11.5 Bessel function contour


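11.1.17 The contour C in Exercise 11.1.16 is deformed to the path $-\infty$ to $-1$, unit
circle $e^{-i\pi}$ to $e^{i\pi}$, and finally $-1$ to $-\infty$. Show that

$$J_\nu(x) = \frac{1}{\pi}\int_0^\pi \cos(\nu\theta - x\sin\theta)\,d\theta - \frac{\sin\nu\pi}{\pi}\int_0^\infty e^{-\nu\theta - x\sinh\theta}\,d\theta.$$

This is Bessel's integral.

Hint. The negative values of the variable of integration u may be handled by
using

$$u = te^{\pm i\pi}.$$

11.1.18 (a) Show that

$$J_\nu(x) = \frac{2}{\pi^{1/2}(\nu - \frac{1}{2})!}\left(\frac{x}{2}\right)^\nu\int_0^{\pi/2} \cos(x\sin\theta)\cos^{2\nu}\theta\,d\theta,$$

where $\nu > -\frac{1}{2}$.

Hint. Here is a chance to use series expansion and term-by-term integration.
The formulas of Section 10.4 will prove useful.

(b) Transform the integral in part (a) into

$$J_\nu(x) = \frac{1}{\pi^{1/2}(\nu - \frac{1}{2})!}\left(\frac{x}{2}\right)^\nu\int_0^\pi \cos(x\cos\theta)\sin^{2\nu}\theta\,d\theta$$

$$= \frac{1}{\pi^{1/2}(\nu - \frac{1}{2})!}\left(\frac{x}{2}\right)^\nu\int_0^\pi e^{\pm ix\cos\theta}\sin^{2\nu}\theta\,d\theta$$

$$= \frac{1}{\pi^{1/2}(\nu - \frac{1}{2})!}\left(\frac{x}{2}\right)^\nu\int_{-1}^{1} e^{\pm ipx}(1 - p^2)^{\nu - 1/2}\,dp.$$

These are alternate integral representations of $J_\nu(x)$.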


11.1.19 (a) 


From 



v-l e t~ X 2 l4t dt 


derive the recurrence relation 


= -j,(x) - j v+1 (x). 

x 
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(b) From 


•!»(*) = 

2.711 


t v 1 e&fiw dt 


derive the recurrence relation 

J'Ax) = R-iW - K+i W 

11.1.20 Show that the recurrence relation

\[ J_n'(x) = \tfrac{1}{2}\left[J_{n-1}(x) - J_{n+1}(x)\right] \]

follows directly from differentiation of

\[ J_n(x) = \frac{1}{\pi}\int_0^{\pi}\cos(n\theta - x\sin\theta)\, d\theta. \]
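
The integral of Exercise 11.1.20 is also convenient for a quick numerical check of the recurrence relation. The following Python sketch is an illustration only: the names `bessel_j`, `derivative`, and `recurrence_residual`, the use of the trapezoidal rule, and the step sizes are our own assumptions and do not come from the text. It evaluates Bessel's integral for integer n and compares a central-difference derivative with $\frac{1}{2}[J_{n-1} - J_{n+1}]$.

```python
import math


def bessel_j(n, x, steps=400):
    """Approximate J_n(x) for integer n from Bessel's integral.

    J_n(x) = (1/pi) * integral_0^pi cos(n*theta - x*sin(theta)) dtheta,
    evaluated with the trapezoidal rule (assumed adequate for moderate x).
    """
    h = math.pi / steps

    def integrand(theta):
        return math.cos(n * theta - x * math.sin(theta))

    total = 0.5 * (integrand(0.0) + integrand(math.pi))
    for k in range(1, steps):
        total += integrand(k * h)
    return total * h / math.pi


def derivative(n, x, dx=1.0e-5):
    """Central-difference estimate of dJ_n/dx."""
    return (bessel_j(n, x + dx) - bessel_j(n, x - dx)) / (2.0 * dx)


def recurrence_residual(n, x):
    """J_n'(x) - (1/2)[J_{n-1}(x) - J_{n+1}(x)]; should be close to zero."""
    return derivative(n, x) - 0.5 * (bessel_j(n - 1, x) - bessel_j(n + 1, x))


if __name__ == "__main__":
    for n in (0, 1, 2):
        print(n, recurrence_residual(n, 3.7))
```

A small residual is of course not a proof; it only guards against sign and index slips in the analytic derivation.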

11.1.21 Evaluate

\[ \int_0^{\infty} e^{-ax} J_0(bx)\, dx, \qquad a, b > 0. \]

Actually the results hold for $a > 0$, $-\infty < b < \infty$. This is a Laplace transform
of $J_0$.

Hint. Either an integral representation of $J_0$ or a series expansion will be
helpful.

11.1.22 Using trigonometric forms, verify that

\[ J_0(br) = \frac{1}{2\pi}\int_0^{2\pi} e^{ibr\sin\theta}\, d\theta. \]


11.1.23 (a) Plot the intensity ($\Phi^2$ of Eq. 11.35) as a function of $(\sin\alpha/\lambda)$ along a diameter
of the circular diffraction pattern. Locate the first two minima.

(b) What fraction of the total light intensity falls within the central maximum?
Hint. $[J_1(x)]^2/x$ may be written as a derivative and the area integral of the
intensity integrated by inspection.


11.1.24 The fraction of light incident on a circular aperture (normal incidence) that is
transmitted is given by

\[ T = 2\int_0^{2ka} J_2(x)\,\frac{dx}{x} - \frac{1}{2ka}\int_0^{2ka} J_2(x)\, dx. \]

Here a is the radius of the aperture, and k is the wave number, $2\pi/\lambda$. Show that

(a) \[ T = 1 - \frac{1}{ka}\sum_{n=0}^{\infty} J_{2n+1}(2ka), \]

(b) \[ T = 1 - \frac{1}{2ka}\int_0^{2ka} J_0(x)\, dx. \]


11.1.25 The amplitude $U(\rho, \varphi, t)$ of a vibrating circular membrane of radius a satisfies
the wave equation

\[ \nabla^2 U - \frac{1}{v^2}\frac{\partial^2 U}{\partial t^2} = 0. \]

Here v is the phase velocity of the wave fixed by the elastic constants and
whatever damping is imposed.

(a) Show that a solution is

\[ U(\rho, \varphi, t) = J_m(k\rho)\left(a_1 e^{im\varphi} + a_2 e^{-im\varphi}\right)\left(b_1 e^{i\omega t} + b_2 e^{-i\omega t}\right). \]

(b) From the Dirichlet boundary condition, $J_m(ka) = 0$, find the allowable
values of the wavelength $\lambda$ ($k = 2\pi/\lambda$).

Note. There are other Bessel functions besides $J_m$, but they all diverge at $\rho = 0$.
This is shown explicitly in Section 11.3. The divergent behavior is actually
implicit in Eq. 11.6.


11.1.26 Example 11.1.2 describes the TM modes of electromagnetic cavity oscillation.
The transverse electric (TE) modes differ in that we work from the z component
of the magnetic induction B:

\[ \nabla^2 B_z + \alpha^2 B_z = 0 \]

with boundary conditions

\[ B_z(0) = B_z(l) = 0 \quad\text{and}\quad \left.\frac{\partial B_z}{\partial\rho}\right|_{\rho = a} = 0. \]

Show that the TE resonant frequencies are given by

\[ \omega_{mnp} = c\sqrt{\frac{\beta_{mn}^2}{a^2} + \frac{p^2\pi^2}{l^2}}, \qquad p = 1, 2, 3, \ldots. \]


11.1.27 Plot the three lowest TM and the three lowest TE angular resonant frequencies,
$\omega_{mnp}$, as a function of the radius/length ($a/l$) ratio for $0 < a/l < 1.5$.

Hint. Try plotting $\omega^2$ (in units of $c^2/a^2$) versus $(a/l)^2$. Why this choice?

11.1.28 A thin conducting disk of radius a carries a charge q. Show that the potential
is described by

\[ \varphi(r, z) = \frac{q}{4\pi\varepsilon_0 a}\int_0^{\infty} e^{-k|z|} J_0(kr)\,\frac{\sin ka}{k}\, dk, \]

where $J_0$ is the usual Bessel function and r and z are the familiar cylindrical
coordinates.

Note. This is a difficult problem. One approach is through Fourier transforms
such as Exercise 15.3.11. For a discussion of the physical problem see Jackson
(Classical Electrodynamics).


11.1.29 Show that

\[ \int_0^{a} x^m J_n(x)\, dx, \qquad m > n > 0, \]

(a) is integrable in terms of Bessel functions and powers of x [such as $a^p J_q(a)$]
for $m + n$ odd;

(b) may be reduced to integrated terms plus $\int_0^a J_0(x)\, dx$ for $m + n$ even.


11.1.30 Show that

\[ \int_0^{\alpha_{0n}}\left(1 - \frac{y}{\alpha_{0n}}\right) J_0(y)\, y\, dy = \frac{1}{\alpha_{0n}}\int_0^{\alpha_{0n}} J_0(y)\, dy. \]

Here $\alpha_{0n}$ is the nth root of $J_0(y)$. This relation is useful in computation (Exercise
11.2.11). The expression on the right is easier and quicker to evaluate, and
much more accurate. Taking the difference of two terms in the expression on
the left leads to a large relative error.


11.1.31 Write a program that will compute successive roots of the Bessel function
$J_n(x)$, that is, $\alpha_{ns}$, where $J_n(\alpha_{ns}) = 0$. Tabulate the first five roots of $J_0$, $J_1$, and
of $J_2$.

Hint. See Appendix 1 for root-finding techniques and recommendations.

Check value. $\alpha_{12} = 7.01559$.
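
One possible starting point for Exercise 11.1.31 is sketched below in Python. It is only a sketch under stated assumptions: $J_n$ is taken from Bessel's integral (Exercise 11.1.20) with the trapezoidal rule, sign changes are located by scanning with a fixed step of 0.1, and each bracket is refined by bisection. The names `bessel_j` and `bessel_zeros` and all numerical parameters are ours, and Appendix 1 may well recommend a different root finder.

```python
import math


def bessel_j(n, x, steps=400):
    """J_n(x), integer n, from Bessel's integral with the trapezoidal rule."""
    h = math.pi / steps

    def integrand(theta):
        return math.cos(n * theta - x * math.sin(theta))

    total = 0.5 * (integrand(0.0) + integrand(math.pi))
    for k in range(1, steps):
        total += integrand(k * h)
    return total * h / math.pi


def bessel_zeros(n, count, step=0.1):
    """First `count` positive roots of J_n, by scanning and bisection."""
    zeros = []
    a = step                      # start just above x = 0 (trivial zero for n > 0)
    fa = bessel_j(n, a)
    while len(zeros) < count:
        b = a + step
        fb = bessel_j(n, b)
        if fa * fb < 0.0:         # sign change: a root lies in (a, b)
            lo, hi, flo = a, b, fa
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                fmid = bessel_j(n, mid)
                if flo * fmid <= 0.0:
                    hi = mid
                else:
                    lo, flo = mid, fmid
            zeros.append(0.5 * (lo + hi))
        a, fa = b, fb
    return zeros


if __name__ == "__main__":
    for order in (0, 1, 2):
        print(order, [round(z, 5) for z in bessel_zeros(order, 5)])
```

The second root returned for n = 1 can be compared with the check value quoted above.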
11.1.32 The circular aperture diffraction amplitude $\Phi$ of Eq. 11.35 is proportional
to $f(z) = J_1(z)/z$. The corresponding single slit diffraction amplitude is
proportional to $g(z) = \sin z/z$.

(a) Calculate and plot $f(z)$ and $g(z)$ for z = 0.0(0.2)12.0.

(b) Locate the two lowest values of z ($z > 0$) for which $f(z)$ takes on an extreme
value. Calculate the corresponding values of $f(z)$.

(c) Locate the two lowest values of z ($z > 0$) for which $g(z)$ takes on an extreme
value. Calculate the corresponding values of $g(z)$.

11.1.33 Calculate the electrostatic potential of a charged disk $\varphi(r, z)/(q/4\pi\varepsilon_0 a)$ from
the integral form of Exercise 11.1.28. Calculate the potential for r/a = 0.0(0.5)2.0
and z/a = 0.25(0.25)1.25. Why is z/a = 0 omitted? Exercise 12.3.17 is a spherical
harmonic version of this same problem.

Hint. Try a Gauss-Laguerre quadrature, Appendix 2.


11.2 ORTHOGONALITY


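If Bessel's equation, Eq. 11.22a, is divided by x, we see that it becomes
self-adjoint, and therefore by the Sturm-Liouville theory, Section 9.2, the solutions
are expected to be orthogonal, if we can arrange to have appropriate boundary
conditions satisfied. To take care of the boundary conditions, for a finite interval
[0, a], we introduce parameters a and $\alpha_{\nu m}$ into the argument of $J_\nu$ to get
$J_\nu(\alpha_{\nu m}\rho/a)$. Here a is the upper limit of the cylindrical radial coordinate $\rho$.
From Eq. 11.22a

\[ \rho\frac{d^2}{d\rho^2}J_\nu\left(\alpha_{\nu m}\frac{\rho}{a}\right) + \frac{d}{d\rho}J_\nu\left(\alpha_{\nu m}\frac{\rho}{a}\right) + \left(\frac{\alpha_{\nu m}^2\rho}{a^2} - \frac{\nu^2}{\rho}\right)J_\nu\left(\alpha_{\nu m}\frac{\rho}{a}\right) = 0. \tag{11.45} \]

Changing the parameter $\alpha_{\nu m}$ to $\alpha_{\nu n}$, we find that $J_\nu(\alpha_{\nu n}\rho/a)$ satisfies

\[ \rho\frac{d^2}{d\rho^2}J_\nu\left(\alpha_{\nu n}\frac{\rho}{a}\right) + \frac{d}{d\rho}J_\nu\left(\alpha_{\nu n}\frac{\rho}{a}\right) + \left(\frac{\alpha_{\nu n}^2\rho}{a^2} - \frac{\nu^2}{\rho}\right)J_\nu\left(\alpha_{\nu n}\frac{\rho}{a}\right) = 0. \tag{11.45a} \]

Proceeding as in Section 9.2, we multiply Eq. 11.45 by $J_\nu(\alpha_{\nu n}\rho/a)$ and Eq. 11.45a
by $J_\nu(\alpha_{\nu m}\rho/a)$ and subtract, obtaining

\[ J_\nu\left(\alpha_{\nu n}\frac{\rho}{a}\right)\frac{d}{d\rho}\left[\rho\frac{d}{d\rho}J_\nu\left(\alpha_{\nu m}\frac{\rho}{a}\right)\right] - J_\nu\left(\alpha_{\nu m}\frac{\rho}{a}\right)\frac{d}{d\rho}\left[\rho\frac{d}{d\rho}J_\nu\left(\alpha_{\nu n}\frac{\rho}{a}\right)\right] = \frac{\alpha_{\nu n}^2 - \alpha_{\nu m}^2}{a^2}\,\rho\, J_\nu\left(\alpha_{\nu m}\frac{\rho}{a}\right)J_\nu\left(\alpha_{\nu n}\frac{\rho}{a}\right). \tag{11.46} \]

Integrating from $\rho = 0$ to $\rho = a$, we obtain

\[ \int_0^a J_\nu\left(\alpha_{\nu n}\frac{\rho}{a}\right)\frac{d}{d\rho}\left[\rho\frac{d}{d\rho}J_\nu\left(\alpha_{\nu m}\frac{\rho}{a}\right)\right]d\rho - \int_0^a J_\nu\left(\alpha_{\nu m}\frac{\rho}{a}\right)\frac{d}{d\rho}\left[\rho\frac{d}{d\rho}J_\nu\left(\alpha_{\nu n}\frac{\rho}{a}\right)\right]d\rho = \frac{\alpha_{\nu n}^2 - \alpha_{\nu m}^2}{a^2}\int_0^a J_\nu\left(\alpha_{\nu m}\frac{\rho}{a}\right)J_\nu\left(\alpha_{\nu n}\frac{\rho}{a}\right)\rho\, d\rho. \tag{11.47} \]

Upon integrating by parts, we see that the left-hand side of Eq. 11.47 becomes

\[ \left[\rho\, J_\nu\left(\alpha_{\nu n}\frac{\rho}{a}\right)\frac{d}{d\rho}J_\nu\left(\alpha_{\nu m}\frac{\rho}{a}\right)\right]_0^a - \left[\rho\, J_\nu\left(\alpha_{\nu m}\frac{\rho}{a}\right)\frac{d}{d\rho}J_\nu\left(\alpha_{\nu n}\frac{\rho}{a}\right)\right]_0^a. \tag{11.48} \]

For $\nu > 0$ the factor $\rho$ guarantees a zero at the lower limit, $\rho = 0$. Actually the
lower limit on the index $\nu$ may be extended down to $\nu > -1$, Exercise 11.2.4. (1)
At $\rho = a$, each expression vanishes if we choose the parameters $\alpha_{\nu n}$ and $\alpha_{\nu m}$
to be zeros or roots of $J_\nu$; that is, $J_\nu(\alpha_{\nu m}) = 0$. The subscripts now become
meaningful: $\alpha_{\nu m}$ is the mth zero of $J_\nu$.

With this choice of parameters, the left-hand side vanishes (the Sturm-Liouville
boundary conditions are satisfied) and for $m \neq n$

\[ \int_0^a J_\nu\left(\alpha_{\nu m}\frac{\rho}{a}\right)J_\nu\left(\alpha_{\nu n}\frac{\rho}{a}\right)\rho\, d\rho = 0. \tag{11.49} \]

This gives us orthogonality over the interval [0, a].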


Normalization 

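The normalization integral may be developed by returning to Eq. 11.48,
setting $\alpha_{\nu n} = \alpha_{\nu m} + \varepsilon$, and taking the limit $\varepsilon \to 0$ (compare Exercise 11.2.2). With
the aid of the recurrence relation, Eq. 11.16, the result may be written as

\[ \int_0^a \left[J_\nu\left(\alpha_{\nu m}\frac{\rho}{a}\right)\right]^2 \rho\, d\rho = \frac{a^2}{2}\left[J_{\nu+1}(\alpha_{\nu m})\right]^2. \tag{11.50} \]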
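
Equations 11.49 and 11.50 are easy to spot-check numerically. The Python sketch below does so for $\nu = 0$ and $a = 2$; it is an illustration under our own assumptions: the two zeros of $J_0$ are hard-coded standard values, $J_n$ is evaluated from Bessel's integral, and the radial integral uses Simpson's rule. The names `bessel_j`, `simpson`, and `overlap` are hypothetical helpers, not part of the text.

```python
import math

# First two positive zeros of J_0 (standard tabulated values, assumed here).
ALPHA_0 = (2.404825557695773, 5.520078110286311)


def bessel_j(n, x, steps=200):
    """J_n(x), integer n, from Bessel's integral with the trapezoidal rule."""
    h = math.pi / steps

    def integrand(theta):
        return math.cos(n * theta - x * math.sin(theta))

    total = 0.5 * (integrand(0.0) + integrand(math.pi))
    for k in range(1, steps):
        total += integrand(k * h)
    return total * h / math.pi


def simpson(f, lo, hi, intervals=400):
    """Composite Simpson rule; `intervals` must be even."""
    h = (hi - lo) / intervals
    total = f(lo) + f(hi)
    for k in range(1, intervals):
        total += (4.0 if k % 2 else 2.0) * f(lo + k * h)
    return total * h / 3.0


def overlap(m, n, a=2.0):
    """Integral of J_0(alpha_m rho/a) J_0(alpha_n rho/a) rho d(rho) over [0, a]."""
    def f(rho):
        return (bessel_j(0, ALPHA_0[m] * rho / a)
                * bessel_j(0, ALPHA_0[n] * rho / a) * rho)
    return simpson(f, 0.0, a)


if __name__ == "__main__":
    a = 2.0
    print("orthogonality :", overlap(0, 1, a))
    print("normalization :", overlap(0, 0, a),
          "expected", 0.5 * a * a * bessel_j(1, ALPHA_0[0]) ** 2)
```

The weight factor $\rho$ in the integrand is essential; without it the off-diagonal integral does not vanish.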


Bessel Series 

If we assume that the set of Bessel functions $J_\nu(\alpha_{\nu m}\rho/a)$ ($\nu$ fixed, m = 1, 2, 3, ...)
is complete, then any well-behaved but otherwise arbitrary function $f(\rho)$ may
be expanded in a Bessel series (Bessel-Fourier or Fourier-Bessel)

\[ f(\rho) = \sum_{m=1}^{\infty} c_{\nu m} J_\nu\left(\alpha_{\nu m}\frac{\rho}{a}\right), \qquad 0 < \rho < a, \quad \nu > -1. \tag{11.51} \]

The coefficients $c_{\nu m}$ are determined by using Eq. 11.50,

\[ c_{\nu m} = \frac{2}{a^2\left[J_{\nu+1}(\alpha_{\nu m})\right]^2}\int_0^a f(\rho)\, J_\nu\left(\alpha_{\nu m}\frac{\rho}{a}\right)\rho\, d\rho. \tag{11.52} \]

A similar series expansion involving $J_\nu(\beta_{\nu m}\rho/a)$ with $(d/d\rho)J_\nu(\beta_{\nu m}\rho/a)|_{\rho = a} = 0$
is included in Exercises 11.2.3 and 11.2.6(b).
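
As a concrete instance of Eq. 11.52, the following Python sketch computes the first three coefficients for the test function $f(\rho) = 1$ on [0, 1] with $\nu = 0$. The choice of $f$, the hard-coded zeros of $J_0$, the quadrature rules, and the helper names are all our own assumptions. For this particular $f$ the integral can be done in closed form, since $\int_0^1 \rho J_0(\alpha\rho)\, d\rho = J_1(\alpha)/\alpha$, which gives $c_{0m} = 2/[\alpha_{0m} J_1(\alpha_{0m})]$ for comparison.

```python
import math

# First three positive zeros of J_0 (standard tabulated values, assumed here).
ALPHA_0 = (2.404825557695773, 5.520078110286311, 8.653727912911012)


def bessel_j(n, x, steps=200):
    """J_n(x), integer n, from Bessel's integral with the trapezoidal rule."""
    h = math.pi / steps

    def integrand(theta):
        return math.cos(n * theta - x * math.sin(theta))

    total = 0.5 * (integrand(0.0) + integrand(math.pi))
    for k in range(1, steps):
        total += integrand(k * h)
    return total * h / math.pi


def simpson(f, lo, hi, intervals=400):
    """Composite Simpson rule; `intervals` must be even."""
    h = (hi - lo) / intervals
    total = f(lo) + f(hi)
    for k in range(1, intervals):
        total += (4.0 if k % 2 else 2.0) * f(lo + k * h)
    return total * h / 3.0


def coefficient(f, m, a=1.0):
    """c_{0m} of Eq. 11.52 for nu = 0, using the m-th tabulated zero."""
    alpha = ALPHA_0[m]
    integral = simpson(lambda rho: f(rho) * bessel_j(0, alpha * rho / a) * rho,
                       0.0, a)
    return 2.0 * integral / (a * a * bessel_j(1, alpha) ** 2)


def closed_form(m):
    """Coefficient for f = 1, a = 1: 2 / (alpha J_1(alpha))."""
    alpha = ALPHA_0[m]
    return 2.0 / (alpha * bessel_j(1, alpha))


if __name__ == "__main__":
    for m in range(3):
        print(m + 1, coefficient(lambda rho: 1.0, m), closed_form(m))
```

The coefficients alternate in sign and decay slowly, which reflects the fact that $f = 1$ does not vanish at $\rho = a$ while every term of the series does.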


1 The case $\nu = -1$ reverts to $\nu = +1$, Eq. 11.8.


EXAMPLE 11.2.1 Electrostatic Potential in a Hollow Cylinder

From Table 8.2 of Section 8.3 (with $\alpha$ replaced by k) our solution of Laplace's
equation in circular cylindrical coordinates is a linear combination of

\[ \psi_{km}(\rho, \varphi, z) = P_{km}(\rho)\,\Phi_m(\varphi)\,Z_k(z) = J_m(k\rho)\cdot\left[a_m\sin m\varphi + b_m\cos m\varphi\right]\cdot\left[c_1 e^{kz} + c_2 e^{-kz}\right]. \tag{11.53} \]

The particular linear combination is determined by the boundary conditions
to be satisfied.

Our cylinder here has a radius a and a height l. The top end section has a
potential distribution $\psi(\rho, \varphi)$. Elsewhere on the surface the potential is zero. (2)
The problem is to find the electrostatic potential

\[ \psi(\rho, \varphi, z) = \sum_{k,m}\psi_{km}(\rho, \varphi, z) \tag{11.54} \]

everywhere in the interior.

For convenience, the circular cylindrical coordinates are placed as shown
in Fig. 11.4. Since $\psi(\rho, \varphi, 0) = 0$, we take $c_1 = -c_2 = \frac{1}{2}$. The z dependence
becomes $\sinh kz$, vanishing at z = 0. The requirement that $\psi = 0$ on the
cylindrical sides is met by requiring the separation constant k to be

\[ k = k_{mn} = \alpha_{mn}/a, \tag{11.55} \]

where the first subscript m gives the index of the Bessel function, whereas the
second subscript identifies the particular zero of $J_m$.

The electrostatic potential becomes

\[ \psi(\rho, \varphi, z) = \sum_{m=0}^{\infty}\sum_{n=1}^{\infty} J_m\left(\alpha_{mn}\frac{\rho}{a}\right)\cdot\left[a_{mn}\sin m\varphi + b_{mn}\cos m\varphi\right]\cdot\sinh\left(\alpha_{mn}\frac{z}{a}\right). \tag{11.56} \]

Equation 11.56 is a double series: a Bessel series in $\rho$ and a Fourier series in $\varphi$.
At z = l, $\psi = \psi(\rho, \varphi)$, a known function of $\rho$ and $\varphi$. Therefore

\[ \psi(\rho, \varphi) = \sum_{m=0}^{\infty}\sum_{n=1}^{\infty} J_m\left(\alpha_{mn}\frac{\rho}{a}\right)\cdot\left[a_{mn}\sin m\varphi + b_{mn}\cos m\varphi\right]\cdot\sinh\left(\alpha_{mn}\frac{l}{a}\right). \tag{11.57} \]

The constants $a_{mn}$ and $b_{mn}$ are evaluated by using Eqs. 11.49 and 11.50 and the
corresponding equations for $\sin\varphi$ and $\cos\varphi$ (Example 9.2.1 and Eqs. 14.7 to
14.9). We find (3)

\[ \left\{\begin{matrix} a_{mn} \\ b_{mn} \end{matrix}\right\} = 2\left[\pi a^2\sinh\left(\alpha_{mn}\frac{l}{a}\right)J_{m+1}^2(\alpha_{mn})\right]^{-1}\int_0^{2\pi}\int_0^a \psi(\rho, \varphi)\, J_m\left(\alpha_{mn}\frac{\rho}{a}\right)\left\{\begin{matrix}\sin m\varphi \\ \cos m\varphi\end{matrix}\right\}\rho\, d\rho\, d\varphi. \tag{11.58} \]

These are definite integrals, that is, numbers. Substituting back into Eq. 11.56,
the series is specified and the potential $\psi(\rho, \varphi, z)$ is determined. The problem is
solved.
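
To see the series of Example 11.2.1 at work, consider the simplest special case, chosen here purely for illustration: a uniform potential $V_0$ on the top end. Only m = 0 then survives, and Eq. 11.58 (with the factor 2 omitted for m = 0) reduces to coefficients $2V_0/[\alpha_{0n} J_1(\alpha_{0n})\sinh(\alpha_{0n} l/a)]$. The Python sketch below sums the resulting series. The uniform boundary potential, the helper names, the scan-and-bisect root finder, and the number of terms are all our assumptions rather than anything prescribed by the text.

```python
import math


def bessel_j(n, x, steps=400):
    """J_n(x), integer n, from Bessel's integral with the trapezoidal rule."""
    h = math.pi / steps

    def integrand(theta):
        return math.cos(n * theta - x * math.sin(theta))

    total = 0.5 * (integrand(0.0) + integrand(math.pi))
    for k in range(1, steps):
        total += integrand(k * h)
    return total * h / math.pi


def j0_zeros(count, step=0.1):
    """First `count` positive roots of J_0, by scanning and bisection."""
    zeros, a, fa = [], step, bessel_j(0, step)
    while len(zeros) < count:
        b = a + step
        fb = bessel_j(0, b)
        if fa * fb < 0.0:
            lo, hi, flo = a, b, fa
            for _ in range(50):
                mid = 0.5 * (lo + hi)
                fmid = bessel_j(0, mid)
                if flo * fmid <= 0.0:
                    hi = mid
                else:
                    lo, flo = mid, fmid
            zeros.append(0.5 * (lo + hi))
        a, fa = b, fb
    return zeros


def potential(rho, z, a=1.0, length=1.0, v0=1.0, terms=12):
    """Partial sum of Eq. 11.56 for a uniform top potential v0 (m = 0 only)."""
    total = 0.0
    for alpha in j0_zeros(terms):
        coeff = 2.0 * v0 / (alpha * bessel_j(1, alpha))
        # sinh(alpha z/a) / sinh(alpha l/a), written with decaying
        # exponentials so that large alpha cannot overflow.
        ratio = (math.exp(alpha * (z - length) / a)
                 * (1.0 - math.exp(-2.0 * alpha * z / a))
                 / (1.0 - math.exp(-2.0 * alpha * length / a)))
        total += coeff * bessel_j(0, alpha * rho / a) * ratio
    return total


if __name__ == "__main__":
    for z in (0.0, 0.25, 0.5, 0.75):
        print(z, potential(0.0, z))
```

Away from the top end the factor $\sinh(\alpha z/a)/\sinh(\alpha l/a)$ decays exponentially with $\alpha$, so a handful of terms suffices; at z = l the convergence is slow.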


2 If $\psi = 0$ at z = 0, l, but $\psi \neq 0$ for $\rho = a$, the modified Bessel functions,
Section 11.5, are involved.

3 If m = 0, the factor 2 is omitted (compare Eq. 14.8).


Continuum Form

The Bessel series, Eq. 11.51, and Exercise 11.2.6 apply to expansions over
the finite interval [0, a]. If $a \to \infty$, then the series forms may be expected to go
over into integrals. The discrete roots $\alpha_{\nu m}$ become a continuous variable $\alpha$.
A similar situation is encountered in the Fourier series, Section 14.2. The
development of the Bessel integral from the Bessel series is left as Exercise
11.2.8.

For operations with a continuum of Bessel functions, $J_\nu(\alpha\rho)$, a key relation
is the Bessel function closure equation

\[ \int_0^{\infty} J_\nu(\alpha\rho)\, J_\nu(\alpha'\rho)\,\rho\, d\rho = \frac{1}{\alpha}\,\delta(\alpha - \alpha'), \qquad \nu > -\tfrac{1}{2}. \tag{11.59} \]

This may be proved by the use of Hankel transforms, Section 15.1. An alternate
approach, starting from a relation similar to Eq. 9.82, is given by Morse and
Feshbach, Section 6.3.

A second kind of orthogonality (varying the index) is developed for spherical
Bessel functions in Section 11.7.


EXERCISES 

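11.2.1 (a) Show that

\[ (a^2 - b^2)\int_0^P J_\nu(ax)\, J_\nu(bx)\, x\, dx = P\left[b\, J_\nu(aP)\, J_\nu'(bP) - a\, J_\nu'(aP)\, J_\nu(bP)\right], \]

with

\[ J_\nu'(aP) = \left.\frac{d}{d(ax)}J_\nu(ax)\right|_{x = P}. \]

(b)

\[ \int_0^P \left[J_\nu(ax)\right]^2 x\, dx = \frac{P^2}{2}\left\{\left[J_\nu'(aP)\right]^2 + \left(1 - \frac{\nu^2}{a^2P^2}\right)\left[J_\nu(aP)\right]^2\right\}, \qquad \nu > -1. \]

These two integrals are usually called the first and second Lommel integrals.
Hint. We have the development of the orthogonality of the Bessel functions as
an analogy.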


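11.2.2 Show that

\[ \int_0^a \left[J_\nu\left(\alpha_{\nu m}\frac{\rho}{a}\right)\right]^2\rho\, d\rho = \frac{a^2}{2}\left[J_{\nu+1}(\alpha_{\nu m})\right]^2, \qquad \nu > -1. \]

Here $\alpha_{\nu m}$ is the mth zero of $J_\nu$.

Hint. With $\alpha_{\nu n} = \alpha_{\nu m} + \varepsilon$, expand $J_\nu[(\alpha_{\nu m} + \varepsilon)\rho/a]$ about $\alpha_{\nu m}\rho/a$ by a Taylor
expansion.


11.2.3 (a) If $\beta_{\nu m}$ is the mth zero of $(d/d\rho)J_\nu(\beta_{\nu m}\rho/a)$, show that the Bessel functions
are orthogonal over the interval [0, a] with an orthogonality integral

\[ \int_0^a J_\nu\left(\beta_{\nu m}\frac{\rho}{a}\right)J_\nu\left(\beta_{\nu n}\frac{\rho}{a}\right)\rho\, d\rho = 0, \qquad m \neq n, \quad \nu > -1. \]

(b) Derive the corresponding normalization integral (m = n).

ANS. \[ \frac{a^2}{2}\left(1 - \frac{\nu^2}{\beta_{\nu m}^2}\right)\left[J_\nu(\beta_{\nu m})\right]^2, \qquad \nu > -1. \]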
11.2.4 Verify that the orthogonality equation, Eq. 11.49, and the normalization
equation, Eq. 11.50, hold for $\nu > -1$.

Hint. Using power-series expansions, examine the behavior of Eq. 11.48 as
$\rho \to 0$.


11.2.5 From Eq. 11.49 develop a proof that $J_\nu(z)$, $\nu > -1$, has no complex roots.
Hint.

(a) Use the series form of $J_\nu(z)$ to exclude pure imaginary roots.

(b) Assume $\alpha_{\nu m}$ to be complex and take $\alpha_{\nu n}$ to be $\alpha_{\nu m}^*$.


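11.2.6 (a) In the series expansion

\[ f(\rho) = \sum_{m=1}^{\infty} c_{\nu m} J_\nu\left(\alpha_{\nu m}\frac{\rho}{a}\right), \qquad 0 < \rho < a, \quad \nu > -1, \]

with $J_\nu(\alpha_{\nu m}) = 0$, show that the coefficients are given by

\[ c_{\nu m} = \frac{2}{a^2\left[J_{\nu+1}(\alpha_{\nu m})\right]^2}\int_0^a f(\rho)\, J_\nu\left(\alpha_{\nu m}\frac{\rho}{a}\right)\rho\, d\rho. \]

(b) In the series expansion

\[ f(\rho) = \sum_{m=1}^{\infty} d_{\nu m} J_\nu\left(\beta_{\nu m}\frac{\rho}{a}\right), \qquad 0 < \rho < a, \quad \nu > -1, \]

with $(d/d\rho)J_\nu(\beta_{\nu m}\rho/a)|_{\rho = a} = 0$, show that the coefficients are given by

\[ d_{\nu m} = \frac{2}{a^2\left(1 - \nu^2/\beta_{\nu m}^2\right)\left[J_\nu(\beta_{\nu m})\right]^2}\int_0^a f(\rho)\, J_\nu\left(\beta_{\nu m}\frac{\rho}{a}\right)\rho\, d\rho. \]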


11.2.7 A right circular cylinder has an electrostatic potential of $\psi(\rho, \varphi)$ on both ends.
The potential on the curved cylindrical surface is zero. Find the potential at
all interior points.

Hint. Choose your coordinate system and adjust your z dependence to exploit
the symmetry of your potential.


11.2.8 For the continuum case, show that Eqs. 11.51 and 11.52 are replaced by

\[ f(\rho) = \int_0^{\infty} a(\alpha)\, J_\nu(\alpha\rho)\, d\alpha, \]

\[ a(\alpha) = \alpha\int_0^{\infty} f(\rho)\, J_\nu(\alpha\rho)\,\rho\, d\rho. \]

Hint. The corresponding case for sines and cosines is worked out in Section 15.2.
These are Hankel transforms. A derivation for the special case $\nu = 0$ is the
topic of Exercise 15.1.1.


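11.2.9 A function $f(x)$ is expressed as a Bessel series:

\[ f(x) = \sum_{n=1}^{\infty} a_n J_m(\alpha_{mn}x), \]

with $\alpha_{mn}$ the nth root of $J_m$. Prove the Parseval relation

\[ \int_0^1 \left[f(x)\right]^2 x\, dx = \frac{1}{2}\sum_{n=1}^{\infty} a_n^2\left[J_{m+1}(\alpha_{mn})\right]^2. \]


11.2.10 Prove that

\[ \sum_{n=1}^{\infty}(\alpha_{mn})^{-2} = \frac{1}{4(m+1)}. \]

Hint. Expand $x^m$ in a Bessel series and apply the Parseval relation.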
11.2.11 A right circular cylinder of length l has a potential

\[ \psi(z = \pm l/2) = 100(1 - \rho/a), \]

where a is the radius. The potential over the curved surface (side) is zero. Using
the Bessel series from Exercise 11.2.7, calculate the electrostatic potential for
$\rho/a$ = 0.0(0.2)1.0 and z/l = 0.0(0.1)0.5. Take a/l = 0.5.

Hint. From Exercise 11.1.30 you have

\[ \int_0^{\alpha_{0n}}\left(1 - \frac{y}{\alpha_{0n}}\right)J_0(y)\, y\, dy. \]

Show that this equals

\[ \frac{1}{\alpha_{0n}}\int_0^{\alpha_{0n}} J_0(y)\, dy. \]

Numerical evaluation of this latter form rather than the former is both faster
and more accurate.

Note. For $\rho/a$ = 0.0 and z/l = 0.5 the convergence is slow, 20 terms giving
only 98.4 rather than 100.

Check value. For $\rho/a$ = 0.4 and z/l = 0.3,
$\psi$ = 24.558.


11.3 NEUMANN FUNCTIONS, BESSEL FUNCTIONS
OF THE SECOND KIND, $N_\nu(x)$

From the theory of differential equations it is known that Bessel's equation
has two independent solutions. Indeed, for nonintegral order $\nu$ we have already
found two solutions and labeled them $J_\nu(x)$ and $J_{-\nu}(x)$, using the infinite series
(Eq. 11.5). The trouble is that when $\nu$ is integral Eq. 11.8 holds and we have but
one independent solution. A second solution may be developed by the methods
of Section 8.6. This yields a perfectly good second solution of Bessel's equation
but is not the usual standard form.


Definition 

As an alternate approach, we take the particular linear combination of $J_\nu(x)$
and $J_{-\nu}(x)$

\[ N_\nu(x) = \frac{\cos\nu\pi\, J_\nu(x) - J_{-\nu}(x)}{\sin\nu\pi}. \tag{11.60} \]

This is the Neumann function (Fig. 11.6). (1) For nonintegral $\nu$, $N_\nu(x)$ clearly
satisfies Bessel's equation, for it is a linear combination of known solutions,
$J_\nu(x)$ and $J_{-\nu}(x)$. However, for integral $\nu$, $\nu = n$, Eq. 11.8 applies and Eq. 11.60
becomes indeterminate. The definition of $N_\nu(x)$ was chosen deliberately for this
indeterminate property. Evaluating $N_n(x)$ by l'Hospital's rule for indeterminate
forms, we obtain

\[ N_n(x) = \left.\frac{(d/d\nu)\left[\cos\nu\pi\, J_\nu(x) - J_{-\nu}(x)\right]}{(d/d\nu)\sin\nu\pi}\right|_{\nu = n} = \left.\frac{-\pi\sin n\pi\, J_n(x) + \left[\cos n\pi\,\partial J_\nu/\partial\nu - \partial J_{-\nu}/\partial\nu\right]}{\pi\cos n\pi}\right|_{\nu = n} \]

\[ \phantom{N_n(x)} = \frac{1}{\pi}\left[\frac{\partial J_\nu(x)}{\partial\nu} - (-1)^n\frac{\partial J_{-\nu}(x)}{\partial\nu}\right]_{\nu = n}. \tag{11.61} \]


1 In AMS-55 and in most mathematics tables, this is labeled $Y_\nu(x)$.


Series Form 

A series expansion (2) gives the horrible result

\[ N_n(x) = \frac{2}{\pi}J_n(x)\ln\left(\frac{x}{2}\right) - \frac{1}{\pi}\sum_{r=0}^{\infty}(-1)^r\frac{1}{r!\,(n+r)!}\left(\frac{x}{2}\right)^{n+2r}\left[F(r) + F(n+r)\right] - \frac{1}{\pi}\sum_{r=0}^{n-1}\frac{(n-r-1)!}{r!}\left(\frac{x}{2}\right)^{-n+2r}, \tag{11.62} \]

which exhibits the logarithmic dependence that was to be expected. This,
of course, verifies the independence of $J_n$ and $N_n$. $F(r)$ is the digamma function
that arises from differentiating the factorials in the denominator of $J_\nu(x)$
(compare Section 10.2 and especially Eq. 10.39). Using the properties of the
digamma function, we rewrite Eq. 11.62 in the only slightly less horrible form

\[ N_n(x) = \frac{2}{\pi}\left[\ln\left(\frac{x}{2}\right) + \gamma - \frac{1}{2}\sum_{p=1}^{n}\frac{1}{p}\right]J_n(x) - \frac{1}{\pi}\sum_{r=1}^{\infty}(-1)^r\frac{(x/2)^{n+2r}}{r!\,(n+r)!}\sum_{p=1}^{r}\left[\frac{1}{p} + \frac{1}{p+n}\right] - \frac{1}{\pi}\sum_{r=0}^{n-1}\frac{(n-r-1)!}{r!}\left(\frac{x}{2}\right)^{-n+2r}. \tag{11.63} \]

For n = 0 we have the limiting value

\[ N_0(x) = \frac{2}{\pi}(\ln x + \gamma - \ln 2) + O(x^2) \tag{11.64} \]

and for $\nu > 0$

\[ N_\nu(x) = -\frac{(\nu - 1)!}{\pi}\left(\frac{2}{x}\right)^{\nu} + \cdots. \ (3) \tag{11.65} \]


2 Using $(d/d\nu)x^\nu = x^\nu\ln x$.
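
The limiting forms, Eqs. 11.64 and 11.65, can be compared with the definition, Eq. 11.60, numerically. The Python sketch below is a rough illustration under our own assumptions: $J_{\pm\nu}$ is summed from the power series with `math.gamma` supplying the factorials, and $N_0$ is approximated by evaluating Eq. 11.60 at a small nonintegral index ($\nu = 10^{-6}$) instead of carrying out the limit analytically. The names `bessel_j_series` and `neumann` are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def bessel_j_series(nu, x, terms=40):
    """Power series for J_nu(x), x > 0, nu not a negative integer."""
    total = 0.0
    for s in range(terms):
        total += ((-1) ** s / (math.factorial(s) * math.gamma(s + nu + 1.0))
                  * (0.5 * x) ** (2 * s + nu))
    return total


def neumann(nu, x):
    """N_nu(x) from Eq. 11.60; valid only for nonintegral nu."""
    return ((math.cos(nu * math.pi) * bessel_j_series(nu, x)
             - bessel_j_series(-nu, x)) / math.sin(nu * math.pi))


def n0_small_x(x):
    """Leading behaviour of N_0 for small x, Eq. 11.64."""
    return (2.0 / math.pi) * (math.log(x) + EULER_GAMMA - math.log(2.0))


def n_nu_small_x(nu, x):
    """Leading behaviour of N_nu, nu > 0, Eq. 11.65; (nu - 1)! = Gamma(nu)."""
    return -math.gamma(nu) / math.pi * (2.0 / x) ** nu


if __name__ == "__main__":
    x = 0.01
    print(neumann(1.0e-6, x), n0_small_x(x))
    print(neumann(1.5, x), n_nu_small_x(1.5, x))
```

Evaluating Eq. 11.60 very close to an integer involves a difference of nearly equal numbers, so the small-index shortcut loses accuracy; the series of Eq. 11.63 is the safer route for integral order.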


As with all the other Bessel functions, $N_\nu(x)$ has integral representations.
For $N_0(x)$ we have

\[ N_0(x) = -\frac{2}{\pi}\int_0^{\infty}\cos(x\cosh t)\, dt = -\frac{2}{\pi}\int_1^{\infty}\frac{\cos(xt)}{(t^2 - 1)^{1/2}}\, dt, \qquad x > 0. \tag{11.65a} \]

These forms can be derived as the imaginary part of the Hankel representations
of Exercise 11.4.5. The latter form is a Fourier cosine transform.

To verify that $N_\nu(x)$, our Neumann function (Fig. 11.6) or Bessel function
of the second kind, actually does satisfy Bessel's equation for integral n, we may
proceed as follows. Differentiating Bessel's equation for $J_{\pm\nu}(x)$ with respect to $\nu$,
we have

\[ x^2\frac{d^2}{dx^2}\left(\frac{\partial J_{\pm\nu}}{\partial\nu}\right) + x\frac{d}{dx}\left(\frac{\partial J_{\pm\nu}}{\partial\nu}\right) + (x^2 - \nu^2)\frac{\partial J_{\pm\nu}}{\partial\nu} = 2\nu J_{\pm\nu}. \tag{11.66} \]

Multiplying the equation for $J_{-\nu}$ by $(-1)^\nu$, subtracting from the equation for
$J_\nu$ (as suggested by Eq. 11.61), and taking the limit $\nu \to n$, we obtain

\[ x^2\frac{d^2}{dx^2}N_n + x\frac{d}{dx}N_n + (x^2 - n^2)N_n = \frac{2n}{\pi}\left[J_n - (-1)^n J_{-n}\right]. \tag{11.67} \]

For $\nu = n$, an integer, the right-hand side vanishes by Eq. 11.8 and $N_n(x)$ is
seen to be a solution of Bessel's equation. The most general solution for any $\nu$
can therefore be written as

\[ y(x) = A J_\nu(x) + B N_\nu(x). \tag{11.68} \]


It is seen from Eq. 11.62 that $N_n$ diverges at least logarithmically. Any boundary
condition that requires the solution to be finite at the origin [as in our vibrating
circular membrane (Section 11.1)] automatically excludes $N_n(x)$. Conversely,
in the absence of such a requirement $N_n(x)$ must be considered.

To a certain extent the definition of the Neumann function $N_n(x)$ is arbitrary.
Equation 11.63 contains terms of the form $a_n J_n(x)$. Clearly, any finite value of
the constant $a_n$ would still give us a second solution of Bessel's equation. Why
should $a_n$ have the particular value shown in Eq. 11.63? The answer involves
the asymptotic dependence developed in Section 11.6. If $J_n$ corresponds to a
cosine wave, then $N_n$ corresponds to a sine wave. This simple and convenient
asymptotic phase relationship is a consequence of the particular admixture of
$J_n$ in $N_n$.


3 Note that this limiting form applies to both integral and nonintegral values
of the index $\nu$.


Recurrence Relations 

Substituting Eq. 11.60 for $N_\nu(x)$ (nonintegral $\nu$) or Eq. 11.61 (integral $\nu$)
into the recurrence relations (Eqs. 11.10 and 11.12) for $J_n(x)$, we see immediately
that $N_\nu(x)$ satisfies these same recurrence relations. This actually constitutes
another proof that $N_\nu$ is a solution. Note carefully that the converse is not
necessarily true. All solutions need not satisfy the same recurrence relations.
An example of this sort of trouble appears in Section 11.5.


Wronskian Formulas 

From Section 8.6 and Exercise 9.1.4 we have the Wronskian formula (4)
for solutions of the Bessel equation

\[ u_\nu(x)\, v_\nu'(x) - u_\nu'(x)\, v_\nu(x) = \frac{A_\nu}{x}, \tag{11.69} \]

in which $A_\nu$ is a parameter that depends on the particular Bessel functions
$u_\nu(x)$ and $v_\nu(x)$ being considered. It is a constant in the sense that it is independent
of x. Consider the special case

\[ u_\nu(x) = J_\nu(x), \qquad v_\nu(x) = J_{-\nu}(x), \tag{11.70} \]

\[ J_\nu J_{-\nu}' - J_\nu' J_{-\nu} = \frac{A_\nu}{x}. \tag{11.71} \]

Since $A_\nu$ is a constant, it may be identified at any convenient point such as
x = 0. Using the first terms in the series expansions (Eqs. 11.5 and 11.6), we
obtain

\[ J_\nu \to \frac{x^\nu}{2^\nu\,\nu!}, \qquad J_{-\nu} \to \frac{2^\nu x^{-\nu}}{(-\nu)!}, \qquad J_\nu' \to \frac{\nu x^{\nu-1}}{2^\nu\,\nu!}, \qquad J_{-\nu}' \to \frac{-\nu\, 2^\nu x^{-\nu-1}}{(-\nu)!}. \tag{11.72} \]

Substitution into Eq. 11.69 yields

\[ J_\nu(x)\, J_{-\nu}'(x) - J_\nu'(x)\, J_{-\nu}(x) = \frac{-2\nu}{x\,\nu!\,(-\nu)!} = -\frac{2\sin\nu\pi}{\pi x}, \tag{11.73} \]

using Eq. 10.32, by which we have

\[ \nu!\,(-\nu)! = \frac{\pi\nu}{\sin\pi\nu}. \]

Note that $A_\nu$ vanishes for integral $\nu$, as it must, since the nonvanishing of the
Wronskian is a test of the independence of the two solutions. By Eq. 11.73
$J_n$ and $J_{-n}$ are clearly linearly dependent.


4 This result depends on $P(x)$ of Section 8.5 being equal to $p'(x)/p(x)$, the
corresponding coefficient of the self-adjoint form of Section 9.1.

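Using our recurrence relations, we may readily develop a large number of
alternate forms, among which are

\[ J_\nu J_{-\nu+1} + J_{-\nu} J_{\nu-1} = \frac{2\sin\nu\pi}{\pi x}, \tag{11.74} \]

\[ J_\nu J_{-\nu-1} + J_{-\nu} J_{\nu+1} = -\frac{2\sin\nu\pi}{\pi x}, \tag{11.75} \]

\[ J_\nu N_\nu' - J_\nu' N_\nu = \frac{2}{\pi x}, \tag{11.76} \]

\[ J_\nu N_{\nu+1} - J_{\nu+1} N_\nu = -\frac{2}{\pi x}. \tag{11.77} \]

Many more will be found in the references given.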

The reader will recall that in Chapter 8 Wronskians were of great value in 
two respects: (1) in establishing the linear independence or linear dependence 
of solutions of differential equations and (2) in developing an integral form of a 
second solution. Here the specific forms of the Wronskians and Wronskian- 
derived combinations of Bessel functions are useful primarily to illustrate the 
general behavior of the various Bessel functions. Wronskians are of great use 
in checking tables of Bessel functions. In Chapter 16 Wronskians reappear in 
connection with Green’s functions. 
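
The remark that Wronskians are of great use in checking tables can be made concrete with a few lines of Python. The sketch below is our own illustration, not the text's procedure: it sums the power series for $J_{\pm\nu}$, forms $N_\nu$ from Eq. 11.60 (so the index must be nonintegral), and measures how well the derivative-free combinations $J_\nu J_{-\nu+1} + J_{-\nu}J_{\nu-1} = 2\sin\nu\pi/\pi x$ and $J_\nu N_{\nu+1} - J_{\nu+1}N_\nu = -2/\pi x$ are satisfied. All function names and the sample point are assumptions.

```python
import math


def bessel_j_series(nu, x, terms=40):
    """Power series for J_nu(x), x > 0, nu not a negative integer."""
    total = 0.0
    for s in range(terms):
        total += ((-1) ** s / (math.factorial(s) * math.gamma(s + nu + 1.0))
                  * (0.5 * x) ** (2 * s + nu))
    return total


def neumann(nu, x):
    """N_nu(x) from Eq. 11.60; valid only for nonintegral nu."""
    return ((math.cos(nu * math.pi) * bessel_j_series(nu, x)
             - bessel_j_series(-nu, x)) / math.sin(nu * math.pi))


def check_jj(nu, x):
    """Residual of J_nu J_{-nu+1} + J_{-nu} J_{nu-1} = 2 sin(nu pi)/(pi x)."""
    lhs = (bessel_j_series(nu, x) * bessel_j_series(1.0 - nu, x)
           + bessel_j_series(-nu, x) * bessel_j_series(nu - 1.0, x))
    return lhs - 2.0 * math.sin(nu * math.pi) / (math.pi * x)


def check_jn(nu, x):
    """Residual of J_nu N_{nu+1} - J_{nu+1} N_nu = -2/(pi x)."""
    lhs = (bessel_j_series(nu, x) * neumann(nu + 1.0, x)
           - bessel_j_series(nu + 1.0, x) * neumann(nu, x))
    return lhs + 2.0 / (math.pi * x)


if __name__ == "__main__":
    print(check_jj(0.3, 1.7), check_jn(0.3, 1.7))
```

A tabulated or computed set of values that fails such a check by more than rounding error contains a mistake somewhere, although the check alone does not say where.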

EXAMPLE 11.3.1 Coaxial Wave Guides 

We are interested in an electromagnetic wave confined between the
concentric, conducting cylindrical surfaces $\rho = a$ and $\rho = b$. Most of the
mathematics is worked out in Section 2.6 and Example 11.1.2. To go from the standing
wave of these examples to the traveling wave here, we let $a_{mn} = ib_{mn}$ in Eq. 11.40
and obtain

\[ E_z = \sum_{m,n} b_{mn} J_m(\gamma\rho)\, e^{\pm im\varphi}\, e^{i(kz - \omega t)}. \tag{11.78} \]

Additional properties of the components of the electromagnetic wave in the
simple cylindrical wave guide are explored in Exercises 11.3.9 and 11.3.10. For the
coaxial wave guide one generalization is needed. The origin $\rho = 0$ is now
excluded ($0 < a < \rho < b$). Hence the Neumann function $N_m(\gamma\rho)$ may not be
excluded. $E_z(\rho, \varphi, z, t)$ becomes

\[ E_z = \sum_{m,n}\left[b_{mn} J_m(\gamma\rho) + c_{mn} N_m(\gamma\rho)\right] e^{\pm im\varphi}\, e^{i(kz - \omega t)}. \tag{11.79} \]

With the condition

\[ H_z = 0, \tag{11.80} \]

we have the basic equations for a TM (transverse magnetic) wave.

The (tangential) electric field must vanish at the conducting surfaces (Dirichlet
boundary condition) or

\[ b_{mn} J_m(\gamma a) + c_{mn} N_m(\gamma a) = 0, \tag{11.81} \]

\[ b_{mn} J_m(\gamma b) + c_{mn} N_m(\gamma b) = 0. \tag{11.82} \]

These transcendental equations may be solved for $\gamma$ ($\gamma_{mn}$) and the ratio $c_{mn}/b_{mn}$.
From Example 11.1.2,

\[ k^2 = \omega^2\mu_0\varepsilon_0 - \gamma^2 = \frac{\omega^2}{c^2} - \gamma^2. \tag{11.83} \]

Since $k^2$ must be positive for a real wave, the minimum frequency that will be
propagated (in this TM mode) is

\[ \omega = \gamma c, \tag{11.84} \]

with $\gamma$ fixed by the boundary conditions, Eqs. 11.81 and 11.82. This is the cutoff
frequency of the wave guide.

There is also a TE (transverse electric) mode with E z = 0, and H z given by 
Eq. 11.79. Then we have Neumann boundary conditions in place of Eqs. 11.81 
and 11.82. Finally, for the coaxial guide ( not for the plain cylindrical guide, 
a = 0), a TEM (transverse electromagnetic) mode, E z = H z = 0, is possible. 
This corresponds to a plane wave as in free space. 

The simpler cases (no Neumann functions, simpler boundary conditions) of 
a circular wave guide are included as Exercises 11.3.9 and 11.3.10. 

To conclude this discussion of Neumann functions, we introduce the
Neumann function, $N_\nu(x)$, for the following reasons:

1. It is a second, independent solution of Bessel’s equa- 
tion, which completes the general solution. 

2. It is required for specific physical problems such as 
electromagnetic waves in coaxial cables. 

3. It leads to a Green’s function for the Bessel equation 
(Sections 16.5 and 16.6). 

4. It leads directly to the two Hankel functions (Section 
11.4). 


EXERCISES 


11.3.1 Verify the expansions (leading term only)

\[ N_0(x) \to \frac{2}{\pi}(\ln x + \gamma - \ln 2), \]

\[ N_\nu(x) \to -\frac{(\nu - 1)!}{\pi}\left(\frac{2}{x}\right)^{\nu}, \qquad \nu > 0, \quad x \ll 1. \]

For $N_0(x)$ differentiate the definition of the Neumann function as indicated
in Eq. 11.61.

11.3.2 Prove that the Neumann functions $N_n$ (with n an integer) satisfy the recurrence
relations

\[ N_{n-1}(x) + N_{n+1}(x) = \frac{2n}{x}N_n(x), \]

\[ N_{n-1}(x) - N_{n+1}(x) = 2N_n'(x). \]

Hint. These relations may be proved by differentiating the recurrence relations
for $J_\nu$ or by using the limit form of $N_\nu$ but not dividing everything by zero.

11.3.3 Show that

\[ N_{-n}(x) = (-1)^n N_n(x). \]

11.3.4 Show that

\[ N_0'(x) = -N_1(x). \]

11.3.5 If Y and Z are any two solutions of Bessel's equation, show that

\[ Y_\nu(x)\, Z_\nu'(x) - Y_\nu'(x)\, Z_\nu(x) = \frac{A_\nu}{x}, \]

in which $A_\nu$ may depend on $\nu$ but is independent of x. This is really a special
case of Exercise 9.1.4.

11.3.6 Verify the Wronskian formulas

\[ J_\nu(x)\, J_{-\nu+1}(x) + J_{-\nu}(x)\, J_{\nu-1}(x) = \frac{2\sin\nu\pi}{\pi x}, \]

\[ J_\nu(x)\, N_\nu'(x) - J_\nu'(x)\, N_\nu(x) = \frac{2}{\pi x}. \]


11.3.7 As an alternative to letting x approach zero in the evaluation of the Wronskian
constant, we may invoke uniqueness of power series (Section 5.7). The coefficient
of $x^{-1}$ in the series expansion of $u_\nu(x)v_\nu'(x) - u_\nu'(x)v_\nu(x)$ is then $A_\nu$. Show by
series expansion that the coefficients of $x^0$ and $x^1$ of $J_\nu(x)J_{-\nu}'(x) - J_\nu'(x)J_{-\nu}(x)$
are each zero.


11.3.8 (a) By differentiating and substituting into Bessel's differential equation, show
that

\[ \int_0^{\infty}\cos(x\cosh t)\, dt \]

is a solution.

Hint. You can rearrange the final integral as

\[ \int_0^{\infty}\frac{d}{dt}\left\{x\sin(x\cosh t)\sinh t\right\} dt. \]

(b) Show that

\[ N_0(x) = -\frac{2}{\pi}\int_0^{\infty}\cos(x\cosh t)\, dt \]

is linearly independent of $J_0(x)$.
11.3.9 A cylindrical wave guide has radius $r_0$. Find the nonvanishing components
of the electric and magnetic fields for

(a) TM$_{01}$, transverse magnetic wave ($H_z = H_\rho = E_\varphi = 0$),

(b) TE$_{01}$, transverse electric wave ($E_z = E_\rho = H_\varphi = 0$).

The subscripts 01 indicate that the longitudinal component ($E_z$ or $H_z$) involves
$J_0$ and the boundary condition is satisfied by the first zero of $J_0$ or $J_0'$.

Hint. All components of the wave have the same factor: $\exp i(kz - \omega t)$.

11.3.10 For a given mode of oscillation the minimum frequency that will be passed by a
circular cylindrical wave guide (radius $r_0$) is

\[ \nu_c = \frac{c}{\lambda_c}, \]

in which $\lambda_c$ is fixed by the boundary condition

\[ J_n\left(\frac{2\pi r_0}{\lambda_c}\right) = 0 \quad\text{for TM}_{nm}\text{ mode,} \]

\[ J_n'\left(\frac{2\pi r_0}{\lambda_c}\right) = 0 \quad\text{for TE}_{nm}\text{ mode.} \]

The subscript n denotes the order of the Bessel function and m indicates the
zero used. Find this cut-off wavelength, $\lambda_c$, for the three TM and three TE modes
with the longest cut-off wavelengths. Explain your results in terms of the graph
of $J_0$, $J_1$, and $J_2$ (Fig. 11.2).

11.3.11 Write a program that will compute successive roots of the Neumann function
$N_n(x)$; that is, $\alpha_{ns}$, where $N_n(\alpha_{ns}) = 0$. Tabulate the first five roots of $N_0$, $N_1$,
and $N_2$. Check your values for the roots against those listed in AMS-55
(Chapter 9).

Hint. See Appendix 1 for root-finding techniques and recommendations.

Check value. $\alpha_{12} = 5.42968$.
11.3.12 For the case m = 0, a = 1, and b = 2 the coaxial wave guide boundary
conditions lead to

\[ f(x) = \frac{J_0(2x)}{N_0(2x)} - \frac{J_0(x)}{N_0(x)} \]

(Fig. 11.7).

(a) Calculate $f(x)$ for x = 0.0(0.1)10.0 and plot $f(x)$ versus x to find the
approximate location of the roots.

(b) Call a root-finding subroutine to determine the first three roots to higher
precision.

ANS. 3.1230, 6.2734, 9.4182.

Note. The higher roots can be expected to appear at intervals whose length
approaches $\pi$. Why? AMS-55, Section 9.5, gives an approximate formula for
the roots. The function $g(x) = J_0(x)N_0(2x) - J_0(2x)N_0(x)$ is much better
behaved than $f(x)$ previously discussed.
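
A sketch of how part (b) might be approached with the better-behaved function $g(x)$ is given below in Python. It rests on several assumptions of ours: $J_0$ and $N_0$ are summed from their power series, with $N_0$ taken from the n = 0 case of Eq. 11.63 in the form $(2/\pi)[\ln(x/2) + \gamma]J_0(x) + (2/\pi)\sum_{k \ge 1}(-1)^{k+1}H_k (x/2)^{2k}/(k!)^2$, where $H_k = 1 + 1/2 + \cdots + 1/k$; roots are bracketed by a scan with step 0.1 and refined by bisection. The function names are hypothetical, and the power series is only adequate for the moderate arguments needed here.

```python
import math

EULER_GAMMA = 0.5772156649015329


def j0_and_n0(x, terms=80):
    """Return (J_0(x), N_0(x)) from their power series, x > 0."""
    q = 0.25 * x * x
    term = 1.0          # holds (-1)^k q^k / (k!)^2
    harmonic = 0.0      # holds H_k = 1 + 1/2 + ... + 1/k
    j0 = 1.0
    tail = 0.0
    for k in range(1, terms):
        term *= -q / (k * k)
        harmonic += 1.0 / k
        j0 += term
        tail -= harmonic * term
    n0 = (2.0 / math.pi) * ((math.log(0.5 * x) + EULER_GAMMA) * j0 + tail)
    return j0, n0


def g(x):
    """g(x) = J_0(x) N_0(2x) - J_0(2x) N_0(x)."""
    j_x, n_x = j0_and_n0(x)
    j_2x, n_2x = j0_and_n0(2.0 * x)
    return j_x * n_2x - j_2x * n_x


def first_roots(count, start=0.1, step=0.1):
    """First `count` roots of g above `start`, by scanning and bisection."""
    roots, a, ga = [], start, g(start)
    while len(roots) < count:
        b = a + step
        gb = g(b)
        if ga * gb < 0.0:
            lo, hi, glo = a, b, ga
            for _ in range(50):
                mid = 0.5 * (lo + hi)
                gmid = g(mid)
                if glo * gmid <= 0.0:
                    hi = mid
                else:
                    lo, glo = mid, gmid
            roots.append(0.5 * (lo + hi))
        a, ga = b, gb
    return roots


if __name__ == "__main__":
    print([round(r, 4) for r in first_roots(3)])
```

Working with $g(x)$ rather than $f(x)$ avoids the poles of $f$ at the zeros of $N_0$, which would otherwise produce spurious sign changes in the scan.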


11.4 HANKEL FUNCTIONS 

Many authors prefer to introduce the Hankel functions by means of integral
representations and then use them to define the Neumann function, $N_\nu(z)$.
An outline of this approach is given at the end of this section.

Definitions

As we have already obtained the Neumann function by more elementary
(and less powerful) techniques, we may use it to define the Hankel functions,
$H_\nu^{(1)}(x)$ and $H_\nu^{(2)}(x)$:

\[ H_\nu^{(1)}(x) = J_\nu(x) + iN_\nu(x) \tag{11.85} \]

and

\[ H_\nu^{(2)}(x) = J_\nu(x) - iN_\nu(x). \tag{11.86} \]

This is exactly analogous to taking

\[ e^{\pm i\theta} = \cos\theta \pm i\sin\theta. \tag{11.87} \]

For real arguments $H_\nu^{(1)}$ and $H_\nu^{(2)}$ are complex conjugates. The extent of the
analogy will be seen even better when the asymptotic forms are considered
(Section 11.6). Indeed, it is their asymptotic behavior that makes the Hankel
functions useful.

Series expansions of $H_\nu^{(1)}(x)$ and $H_\nu^{(2)}(x)$ may be obtained by combining Eqs.
11.5 and 11.63. Often only the first term is of interest; it is given by

\[ H_0^{(1)}(x) \approx i\frac{2}{\pi}\ln x + 1 + i\frac{2}{\pi}(\gamma - \ln 2) + \cdots, \tag{11.88} \]

\[ H_\nu^{(1)}(x) \approx -i\frac{(\nu - 1)!}{\pi}\left(\frac{2}{x}\right)^{\nu} + \cdots, \qquad \nu > 0, \tag{11.89} \]

\[ H_0^{(2)}(x) \approx -i\frac{2}{\pi}\ln x + 1 - i\frac{2}{\pi}(\gamma - \ln 2) + \cdots, \tag{11.90} \]

\[ H_\nu^{(2)}(x) \approx i\frac{(\nu - 1)!}{\pi}\left(\frac{2}{x}\right)^{\nu} + \cdots, \qquad \nu > 0. \tag{11.91} \]
Since the Hankel functions are linear combinations (with constant
coefficients) of $J_\nu$ and $N_\nu$, they satisfy the same recurrence relations (Eqs. 11.10 and
11.12),

\[ H_{\nu-1}(x) + H_{\nu+1}(x) = \frac{2\nu}{x}H_\nu(x), \tag{11.92} \]

\[ H_{\nu-1}(x) - H_{\nu+1}(x) = 2H_\nu'(x), \tag{11.93} \]

for both $H_\nu^{(1)}(x)$ and $H_\nu^{(2)}(x)$.

A variety of Wronskian formulas can be developed:

\[ H_\nu^{(2)} H_{\nu+1}^{(1)} - H_\nu^{(1)} H_{\nu+1}^{(2)} = \frac{4}{i\pi x}, \tag{11.94} \]

\[ J_{\nu-1} H_\nu^{(1)} - J_\nu H_{\nu-1}^{(1)} = \frac{2}{i\pi x}, \tag{11.95} \]

\[ J_\nu H_{\nu-1}^{(2)} - J_{\nu-1} H_\nu^{(2)} = \frac{2}{i\pi x}. \tag{11.96} \]
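
The definitions, Eqs. 11.85 and 11.86, translate directly into complex arithmetic. The Python sketch below, offered only as an illustration, builds $H_\nu^{(1)}$ and $H_\nu^{(2)}$ for a nonintegral index from power-series values of $J_{\pm\nu}$ and Eq. 11.60, then checks the conjugate property for real x and the Wronskian $H_\nu^{(2)}H_{\nu+1}^{(1)} - H_\nu^{(1)}H_{\nu+1}^{(2)} = 4/(i\pi x)$. The function names and the sample point are our assumptions.

```python
import math


def bessel_j_series(nu, x, terms=40):
    """Power series for J_nu(x), x > 0, nu not a negative integer."""
    total = 0.0
    for s in range(terms):
        total += ((-1) ** s / (math.factorial(s) * math.gamma(s + nu + 1.0))
                  * (0.5 * x) ** (2 * s + nu))
    return total


def neumann(nu, x):
    """N_nu(x) from Eq. 11.60; valid only for nonintegral nu."""
    return ((math.cos(nu * math.pi) * bessel_j_series(nu, x)
             - bessel_j_series(-nu, x)) / math.sin(nu * math.pi))


def hankel1(nu, x):
    """H^(1)_nu(x) = J_nu(x) + i N_nu(x), Eq. 11.85."""
    return complex(bessel_j_series(nu, x), neumann(nu, x))


def hankel2(nu, x):
    """H^(2)_nu(x) = J_nu(x) - i N_nu(x), Eq. 11.86."""
    return complex(bessel_j_series(nu, x), -neumann(nu, x))


def wronskian_residual(nu, x):
    """H2_nu H1_{nu+1} - H1_nu H2_{nu+1} - 4/(i pi x); should be near zero."""
    lhs = (hankel2(nu, x) * hankel1(nu + 1.0, x)
           - hankel1(nu, x) * hankel2(nu + 1.0, x))
    return lhs - 4.0 / (1j * math.pi * x)


if __name__ == "__main__":
    nu, x = 0.4, 2.0
    print(hankel1(nu, x), hankel2(nu, x))
    print(abs(wronskian_residual(nu, x)))
```

For integral index the same construction applies once $N_n$ is taken from its series, Eq. 11.63, instead of Eq. 11.60.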

EXAMPLE 11.4.1 Cylindrical Traveling Waves 

As an illustration of the use of Hankel functions, consider a two-dimensional
wave problem similar to the vibrating circular membrane of Exercise 11.1.25.
Now imagine that the waves are generated at r = 0 and move outward to
infinity. We replace our standing waves by traveling ones. The differential
equation remains the same, but the boundary conditions change. We now
demand that for large r the solution behave like

\[ U \to e^{i(kr - \omega t)} \tag{11.97} \]

to describe an outgoing wave. As before, k is the wave number. This assumes,
for simplicity, that there is no azimuthal dependence, that is, no angular
momentum, or m = 0. In Sections 7.4 and 11.6, $H_0^{(1)}(kr)$ is shown to have the
asymptotic behavior

\[ H_0^{(1)}(kr) \to e^{ikr}. \tag{11.98} \]

This boundary condition at infinity then determines our wave solution as

\[ U(r, t) = H_0^{(1)}(kr)\, e^{-i\omega t}. \tag{11.99} \]

This solution diverges as $r \to 0$, which is just the behavior to be expected with
a source at the origin.

The choice of a two-dimensional wave problem to illustrate the Hankel
function $H_0^{(1)}(z)$ is not accidental. Bessel functions may appear in a variety of
ways, such as in the separation of conical coordinates. However, they enter
most commonly from the radial equations from the separation of variables in
the Helmholtz equation in cylindrical and in spherical polar coordinates. We
have taken a degenerate form of cylindrical coordinates for this illustration.
Had we used spherical polar coordinates (spherical waves), we should have
encountered index $\nu = n + \frac{1}{2}$, n an integer. These special values yield the
spherical Bessel functions to be discussed in Section 11.7.


Contour Integral Representation of the 
Hankel Functions 

The integral representation (Schlaefli integral) 


2m 


y(x}2){i~\lt) 


dt 


t 


'+] 


( 11 . 100 ) 


may easily be established for $\nu = n$, an integer [recognizing that the numerator
is the generating function (Eq. 11.1) and integrating around the origin]. If $\nu$ is
not an integer, the integrand is not single-valued and a cut line is needed in our
complex plane. Choosing the negative real axis as the cut line and using the
contour shown in Fig. 11.8, we can extend Eq. 11.100 to nonintegral $\nu$.
Substituting Eq. 11.100 into Bessel's differential equation, we can represent the
combined integrand by an exact differential that vanishes as $t \to \infty e^{\pm i\pi}$ (compare
Exercise 11.1.16).


FIG. 11.8 Bessel function contour


We now deform the contour so that it approaches the origin along the positive
real axis, as shown in Fig. 11.9. This particular approach guarantees that the
exact differential mentioned will vanish as $t \to 0$ because of the $e^{-x/2t}$ factor.
Hence each of the separate portions, $\infty e^{-i\pi}$ to $0$ and $0$ to $\infty e^{i\pi}$, is a solution of
Bessel's equation. We define


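$$H_\nu^{(1)}(x) = \frac{1}{\pi i} \int_0^{\infty e^{i\pi}} e^{(x/2)(t - 1/t)}\, \frac{dt}{t^{\nu+1}}, \tag{11.101}$$

$$H_\nu^{(2)}(x) = \frac{1}{\pi i} \int_{\infty e^{-i\pi}}^{0} e^{(x/2)(t - 1/t)}\, \frac{dt}{t^{\nu+1}}. \tag{11.102}$$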


These expressions are particularly convenient because they may be handled
by the method of steepest descents (Section 7.4). $H_\nu^{(1)}(x)$ has a saddle point at
$t = +i$, whereas $H_\nu^{(2)}(x)$ has a saddle point at $t = -i$.

FIG. 11.9 Hankel function contours

The problem of relating Eqs. 11.101 and 11.102 to our earlier definition of
the Hankel function (Eqs. 11.85 and 11.86) remains. Since Eqs. 11.100 to 11.102
combined yield

$$J_\nu(x) = \tfrac{1}{2}\left[H_\nu^{(1)}(x) + H_\nu^{(2)}(x)\right] \tag{11.103}$$

by inspection, we need only show that

$$N_\nu(x) = \frac{1}{2i}\left[H_\nu^{(1)}(x) - H_\nu^{(2)}(x)\right]. \tag{11.104}$$

This may be accomplished by the following steps: 

1. With the substitutions $t = e^{i\pi}/s$ for $H_\nu^{(1)}$ and $t = e^{-i\pi}/s$
   for $H_\nu^{(2)}$, we obtain

   $$H_\nu^{(1)}(x) = e^{-i\nu\pi} H_{-\nu}^{(1)}(x), \tag{11.105}$$

   $$H_\nu^{(2)}(x) = e^{i\nu\pi} H_{-\nu}^{(2)}(x). \tag{11.106}$$

2. From Eqs. 11.103 ($\nu \to -\nu$), and 11.105 and 11.106,

   $$J_{-\nu}(x) = \tfrac{1}{2}\left[e^{i\nu\pi} H_\nu^{(1)}(x) + e^{-i\nu\pi} H_\nu^{(2)}(x)\right]. \tag{11.107}$$

3. Finally, substitute $J_\nu$ (Eq. 11.103) and $J_{-\nu}$ (Eq. 11.107)
   into the defining equation for $N_\nu$, Eq. 11.60. This leads
   to Eq. 11.104 and establishes the contour integrals
   Eqs. 11.101 and 11.102 as the Hankel functions.

Integral representations have appeared before: Eq. 10.35 for $\Gamma(z)$ and various
representations of $J_\nu(z)$ in Section 11.1. With these integral representations of
the Hankel functions, it is perhaps appropriate to ask why we are interested in
integral representations. There are at least four reasons. The first is simply
aesthetic appeal: some people find them attractive. Second, the integral
representations help to distinguish between two linearly independent solutions. In
Fig. 11.7, the contours $C_1$ and $C_2$ cross different saddle points (Section 7.4).
For the Legendre functions the contour for $P_n(z)$ (Fig. 129) and that for $Q_n(z)$
encircle different singular points.

Third, the integral representations facilitate manipulations, analysis, and the 
development of relations among the various special functions. Fourth, and 
probably most important of all, the integral representations are extremely useful 
in developing asymptotic expansions. One approach, the method of steepest 
descents, appears in Section 7.4. A second approach, the direct expansion of
an integral representation, is given in Section 11.6 for the modified Bessel
function $K_\nu(z)$. This same technique may be used to obtain asymptotic expansions
of the confluent hypergeometric functions, $M$ and $U$ (Exercise 13.6.13).

In conclusion, the Hankel functions are introduced here for the following 
reasons : 

1. As analogs of $e^{\pm ix}$ they are useful for describing
   traveling waves.

2. They offer an alternate (contour integral) and a rather
   elegant definition of Bessel functions.

3. $H_\nu^{(1)}$ is used to define the modified Bessel function $K_\nu$
   of Section 11.5.


EXERCISES 


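11.4.1 Verify the Wronskian formulas

(a) $J_\nu(x) H_\nu^{(1)\prime}(x) - J_\nu'(x) H_\nu^{(1)}(x) = \dfrac{2i}{\pi x}$,

(b) $J_\nu(x) H_\nu^{(2)\prime}(x) - J_\nu'(x) H_\nu^{(2)}(x) = \dfrac{-2i}{\pi x}$,

(c) $N_\nu(x) H_\nu^{(1)\prime}(x) - N_\nu'(x) H_\nu^{(1)}(x) = \dfrac{-2}{\pi x}$,

(d) $N_\nu(x) H_\nu^{(2)\prime}(x) - N_\nu'(x) H_\nu^{(2)}(x) = \dfrac{-2}{\pi x}$,

(e) $H_\nu^{(1)}(x) H_\nu^{(2)\prime}(x) - H_\nu^{(1)\prime}(x) H_\nu^{(2)}(x) = \dfrac{-4i}{\pi x}$,

(f) $H_\nu^{(2)}(x) H_{\nu+1}^{(1)}(x) - H_\nu^{(1)}(x) H_{\nu+1}^{(2)}(x) = \dfrac{4}{i\pi x}$,

(g) $J_{\nu-1}(x) H_\nu^{(1)}(x) - J_\nu(x) H_{\nu-1}^{(1)}(x) = \dfrac{2}{i\pi x}$.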
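11.4.2 Show that the integral forms

(a) $\dfrac{1}{i\pi} \displaystyle\int_{0,\,C_1}^{\infty e^{i\pi}} e^{(x/2)(t - 1/t)}\, \dfrac{dt}{t^{\nu+1}} = H_\nu^{(1)}(x)$,

(b) $\dfrac{1}{i\pi} \displaystyle\int_{\infty e^{-i\pi},\,C_2}^{0} e^{(x/2)(t - 1/t)}\, \dfrac{dt}{t^{\nu+1}} = H_\nu^{(2)}(x)$

satisfy Bessel's differential equation. The contours $C_1$ and $C_2$ are shown in Fig.
11.9.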


11.4.3 Using the integrals and contours given in problem 11.4.2, show that

$$\frac{1}{2i}\left[H_\nu^{(1)}(x) - H_\nu^{(2)}(x)\right] = N_\nu(x).$$


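11.4.4 Show that the integrals in Exercise 11.4.2 may be transformed to yield

(a) $H_\nu^{(1)}(x) = \dfrac{1}{\pi i} \displaystyle\int_{C_3} e^{x \sinh\gamma - \nu\gamma}\, d\gamma$,

(b) $H_\nu^{(2)}(x) = \dfrac{1}{\pi i} \displaystyle\int_{C_4} e^{x \sinh\gamma - \nu\gamma}\, d\gamma$

(see Fig. 11.10).

FIG. 11.10 Hankel function contours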


11.4.5 (a) Transform $H_0^{(1)}(x)$, Eq. 11.101, into

$$H_0^{(1)}(x) = \frac{1}{i\pi} \int_C e^{ix\cosh s}\, ds,$$

where the contour $C$ runs from $-\infty - i\pi/2$ through the origin of the $s$-plane
to $\infty + i\pi/2$.

(b) Justify rewriting $H_0^{(1)}(x)$ as

$$H_0^{(1)}(x) = \frac{2}{i\pi} \int_0^{\infty + i\pi/2} e^{ix\cosh s}\, ds.$$

(c) Verify that this integral representation actually satisfies Bessel's differential
equation. (The $i\pi/2$ in the upper limit is not essential. It serves as a
convergence factor. We can replace it by $ia\pi/2$ and take the limit.)


11.4.6 From

$$H_0^{(1)}(x) = \frac{2}{i\pi} \int_0^\infty e^{ix\cosh s}\, ds$$

show that

(a) $J_0(x) = \dfrac{2}{\pi} \displaystyle\int_0^\infty \sin(x\cosh s)\, ds$,

(b) $J_0(x) = \dfrac{2}{\pi} \displaystyle\int_1^\infty \dfrac{\sin(xt)}{\sqrt{t^2 - 1}}\, dt$.

This last result is a Fourier sine transform.

11.4.7 From

$$H_0^{(1)}(x) = \frac{2}{i\pi} \int_0^\infty e^{ix\cosh s}\, ds$$

show that

(a) $N_0(x) = -\dfrac{2}{\pi} \displaystyle\int_0^\infty \cos(x\cosh s)\, ds$,

(b) $N_0(x) = -\dfrac{2}{\pi} \displaystyle\int_1^\infty \dfrac{\cos(xt)}{\sqrt{t^2 - 1}}\, dt$.

These are the equations given as Eq. 11.65a.
This last result is a Fourier cosine transform.


11.5 MODIFIED BESSEL FUNCTIONS, $I_\nu(x)$ and $K_\nu(x)$

The Helmholtz equation,

$$\nabla^2\psi + k^2\psi = 0,$$

separated in circular cylindrical coordinates, leads to Eq. 11.22a, the Bessel
equation. Equation 11.22a is satisfied by the Bessel and Neumann functions
$J_\nu(k\rho)$ and $N_\nu(k\rho)$ and any linear combination such as the Hankel functions
$H_\nu^{(1)}(k\rho)$ and $H_\nu^{(2)}(k\rho)$. Now the Helmholtz equation describes the space part
of wave phenomena. If instead we have a diffusion problem, then the Helmholtz
equation is replaced by

$$\nabla^2\psi - k^2\psi = 0. \tag{11.108}$$

The analog to Eq. 11.22a is

$$\rho^2 \frac{d^2}{d\rho^2} Y_\nu(k\rho) + \rho \frac{d}{d\rho} Y_\nu(k\rho) - (k^2\rho^2 + \nu^2)\, Y_\nu(k\rho) = 0. \tag{11.109}$$

The Helmholtz equation may be transformed into the diffusion equation by
the transformation $k \to ik$. Similarly, $k \to ik$ changes Eq. 11.22a into Eq. 11.109
and shows that

$$Y_\nu(k\rho) = Z_\nu(ik\rho).$$

The solutions of Eq. 11.109 are Bessel functions of imaginary argument. To
obtain a solution that is regular at the origin, we take $Z_\nu$ as the regular Bessel
function $J_\nu$. It is customary (and convenient) to choose the normalization so
that

$$Y_\nu(k\rho) = I_\nu(x) = i^{-\nu} J_\nu(ix). \tag{11.110}$$

(Here the variable $k\rho$ is being replaced by $x$ for simplicity.) Often this is written
as

$$I_\nu(x) = e^{-\nu\pi i/2} J_\nu(x e^{i\pi/2}). \tag{11.111}$$

$I_0$ and $I_1$ are shown in Figure 11.11.

FIG. 11.11 Modified Bessel functions


Series Form 

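In terms of infinite series this is equivalent to removing the $(-1)^s$ sign in
Eq. 11.5 and writing

$$I_\nu(x) = \sum_{s=0}^{\infty} \frac{1}{s!\,(s + \nu)!} \left(\frac{x}{2}\right)^{2s+\nu}, \qquad
I_{-\nu}(x) = \sum_{s=0}^{\infty} \frac{1}{s!\,(s - \nu)!} \left(\frac{x}{2}\right)^{2s-\nu}. \tag{11.112}$$

The extra $i^{-\nu}$ normalization cancels the $i^{\nu}$ from each term and leaves $I_\nu(x)$ real.
For integral $\nu$ this yields

$$I_n(x) = I_{-n}(x). \tag{11.113}$$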
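As a rough numerical illustration of the series form, the following Python sketch sums Eq. 11.112 directly. The function name `bessel_i_series` and the fixed number of terms are choices made here for illustration, not part of the text; the factorial $(s+\nu)!$ is evaluated as $\Gamma(s+\nu+1)$, and the sketch assumes $\nu \ge 0$ and $x \ge 0$.

```python
import math


def bessel_i_series(nu, x, terms=60):
    """Partial sum of the ascending series, Eq. 11.112, for I_nu(x).

    Assumes nu >= 0 and x >= 0; (s + nu)! is evaluated as Gamma(s + nu + 1).
    """
    total = 0.0
    for s in range(terms):
        total += (x / 2.0) ** (2 * s + nu) / (
            math.factorial(s) * math.gamma(s + nu + 1.0)
        )
    return total


if __name__ == "__main__":
    for nu in (0, 1, 0.5):
        print(f"I_{nu}(1.0) = {bessel_i_series(nu, 1.0):.12f}")
```

Because every term is positive there is no cancellation, so for moderate $x$ the direct sum is well behaved; for large $x$ many terms are needed and the asymptotic form of Section 11.6 is the more economical choice.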


Recurrence Relations 

The recurrence relations satisfied by $I_\nu(x)$ may be developed from the series
expansions, but it is perhaps easier to work from the existing recurrence relations
for $J_\nu(x)$. Let us replace $x$ by $-ix$ and rewrite Eq. 11.110 as

$$J_\nu(x) = i^{\nu} I_\nu(-ix). \tag{11.114}$$

Then Eq. 11.10 becomes

$$i^{\nu-1} I_{\nu-1}(-ix) + i^{\nu+1} I_{\nu+1}(-ix) = \frac{2\nu}{x}\, i^{\nu} I_\nu(-ix).$$

Replacing $x$ by $ix$, we have a recurrence relation for $I_\nu(x)$,

$$I_{\nu-1}(x) - I_{\nu+1}(x) = \frac{2\nu}{x}\, I_\nu(x). \tag{11.115}$$

Equation 11.12 transforms to

$$I_{\nu-1}(x) + I_{\nu+1}(x) = 2 I_\nu'(x). \tag{11.116}$$

These are the recurrence relations used in Exercise 11.1.14.
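The two recurrence relations for $I_\nu$ can be checked numerically. The sketch below is only a sanity check, not a derivation: it evaluates $I_\nu$ from the ascending series and approximates $I_\nu'$ by a central difference. The helper names and the step size `h` are assumptions made for this illustration, and the code assumes $\nu \ge 1$ so that every order involved is non-negative.

```python
import math


def bessel_i(nu, x, terms=60):
    """I_nu(x) from the ascending power series (nu >= 0, x >= 0)."""
    return sum(
        (x / 2.0) ** (2 * s + nu) / (math.factorial(s) * math.gamma(s + nu + 1.0))
        for s in range(terms)
    )


def recurrence_residuals(nu, x, h=1e-5):
    """Return the residuals of the two recurrence relations (both should be ~0).

    r1: I_{nu-1} - I_{nu+1} - (2 nu / x) I_nu
    r2: I_{nu-1} + I_{nu+1} - 2 I_nu'   (derivative by central difference)
    """
    lower, same, upper = bessel_i(nu - 1, x), bessel_i(nu, x), bessel_i(nu + 1, x)
    r1 = lower - upper - (2.0 * nu / x) * same
    derivative = (bessel_i(nu, x + h) - bessel_i(nu, x - h)) / (2.0 * h)
    r2 = lower + upper - 2.0 * derivative
    return r1, r2


if __name__ == "__main__":
    print(recurrence_residuals(1.5, 2.0))
```

The first residual is limited only by rounding; the second also carries the truncation error of the finite difference.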

It is worth emphasizing that although two recurrence relations, Eqs. 11.115 
and 11.116 or Exercise 11.5.7, specify the second-order differential equation, 
the converse is not true. The differential equation does not uniquely fix the 
recurrence relations. Equations 11.115 and 11.116 and Exercise 11.5.7 provide 
an example. 

From Eq. 11.113 it is seen that we have but one independent solution when
$\nu$ is an integer, exactly as in the Bessel functions $J_\nu$. The choice of a second,
independent solution of Eq. 11.108 is essentially a matter of convenience. The
second solution given here is selected on the basis of its asymptotic behavior,
as shown in the next section. The confusion of choice and notation for this
solution is perhaps greater than anywhere else in this field.¹ Many authors²
choose to define a second solution in terms of the Hankel function $H_\nu^{(1)}(x)$ by

$$K_\nu(x) = \frac{\pi}{2}\, i^{\nu+1} H_\nu^{(1)}(ix) = \frac{\pi}{2}\, i^{\nu+1} \left[J_\nu(ix) + i N_\nu(ix)\right]. \tag{11.117}$$

The factor $i^{\nu+1}$ makes $K_\nu(x)$ real when $x$ is real. Using Eqs. 11.60 and 11.110,
we may transform Eq. 11.117 to³

$$K_\nu(x) = \frac{\pi}{2}\, \frac{I_{-\nu}(x) - I_\nu(x)}{\sin \nu\pi}, \tag{11.118}$$

analogous to Eq. 11.60 for $N_\nu(x)$. The choice of Eq. 11.117 as a definition is
somewhat unfortunate in that the function $K_\nu(x)$ does not satisfy the same
recurrence relations as $I_\nu(x)$ (compare Exercises 11.5.7 and 11.5.8). To avoid
this annoyance other authors⁴ have included an additional factor of $\cos n\pi$.
This permits $K_\nu$ to satisfy the same recurrence relations as $I_\nu$, but it has the
disadvantage of making $K_\nu = 0$ for $\nu = \tfrac{1}{2}, \tfrac{3}{2}, \tfrac{5}{2}, \ldots$.

The series expansion of $K_\nu(x)$ follows directly from the series form of $H_\nu^{(1)}(ix)$.
The lowest order terms are

$$K_0(x) = -\ln x - \gamma + \ln 2 + \cdots, \qquad
K_\nu(x) = 2^{\nu-1} (\nu - 1)!\, x^{-\nu} + \cdots. \tag{11.119}$$

Because the modified Bessel function $I_\nu$ is related to the Bessel function $J_\nu$,
much as sinh is related to sine, $I_\nu$ and the second solution $K_\nu$ are sometimes
referred to as hyperbolic Bessel functions.


1 A discussion and comparison of notations will be found in MTAC 1, 
207-308 (1944). 

2 Watson, Morse and Feshbach, Jeffreys and Jeffreys (without the $\pi/2$).

3 For integral index $n$ we take the limit as $\nu \to n$.

4 Whittaker and Watson.

$I_0(x)$ and $K_0(x)$ have the integral representations

$$I_0(x) = \frac{1}{\pi} \int_0^{\pi} \cosh(x\cos\theta)\, d\theta, \tag{11.120}$$

$$K_0(x) = \int_0^{\infty} \cos(x \sinh t)\, dt = \int_0^{\infty} \frac{\cos(xt)\, dt}{(t^2 + 1)^{1/2}}, \qquad x > 0. \tag{11.121}$$

Equation 11.120 may be derived from Eq. 11.30 for $J_0(x)$ or may be taken
as a special case of Exercise 11.5.4, $\nu = 0$. The integral representation of $K_0$,
Eq. 11.121, is a Fourier transform and may best be derived with Fourier
transforms, Chapter 15, or with Green's functions, Section 16.6. A variety of other
forms of integral representations (including $\nu \neq 0$) appear in the exercises.
These integral representations are useful in developing asymptotic forms
(Section 11.6) and in connection with Fourier transforms, Chapter 15.
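The integral representations also lend themselves to direct numerical quadrature. The sketch below evaluates $I_0(x)$ from Eq. 11.120 with Simpson's rule. For $K_0(x)$ it deliberately swaps in the exponentially decaying form of Exercise 11.5.15, $K_0(x) = \int_0^\infty e^{-x\cosh t}\,dt$, because the oscillatory integrals of Eq. 11.121 converge too slowly for a simple rule. The function names, the cutoff `t_max`, and the step counts are assumptions made for this illustration.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def i0_integral(x):
    """I_0(x) from Eq. 11.120."""
    return simpson(lambda th: math.cosh(x * math.cos(th)), 0.0, math.pi) / math.pi


def k0_integral(x, t_max=12.0):
    """K_0(x), x > 0, from the decaying form of Exercise 11.5.15.

    The infinite range is truncated at t_max (an assumed cutoff); the
    integrand is negligible there for x of order 0.01 or larger.
    """
    return simpson(lambda t: math.exp(-x * math.cosh(t)), 0.0, t_max, n=4000)


if __name__ == "__main__":
    print(i0_integral(1.0), k0_integral(1.0))
```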

To put the modified Bessel functions $I_\nu(x)$ and $K_\nu(x)$ in proper perspective,
we introduce them here because:

1. These functions are solutions of the frequently
   encountered modified Bessel equation.

2. They are needed for specific physical problems such
   as diffusion problems.

3. $K_\nu(x)$ provides a Green's function, Section 16.6.

4. $K_\nu(x)$ leads to a convenient determination of
   asymptotic behavior (Section 11.6).


EXERCISES 

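11.5.1 Show that

$$e^{(x/2)(t + 1/t)} = \sum_{n=-\infty}^{\infty} I_n(x)\, t^n,$$

thus generating modified Bessel functions, $I_n(x)$.

11.5.2 Verify the following identities

(a) $1 = I_0(x) + 2 \displaystyle\sum_{n=1}^{\infty} (-1)^n I_{2n}(x)$,

(b) $e^{x} = I_0(x) + 2 \displaystyle\sum_{n=1}^{\infty} I_n(x)$,

(c) $e^{-x} = I_0(x) + 2 \displaystyle\sum_{n=1}^{\infty} (-1)^n I_n(x)$,

(d) $\cosh x = I_0(x) + 2 \displaystyle\sum_{n=1}^{\infty} I_{2n}(x)$,

(e) $\sinh x = 2 \displaystyle\sum_{n=1}^{\infty} I_{2n-1}(x)$.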

11.5.3 (a) From the generating function of Exercise 11.5.1 show that

$$I_n(x) = \frac{1}{2\pi i} \oint \exp[(x/2)(t + 1/t)]\, \frac{dt}{t^{n+1}}.$$

(b) For $n = \nu$, not an integer, show that the preceding integral representation
may be generalized to

$$I_\nu(x) = \frac{1}{2\pi i} \int_C \exp[(x/2)(t + 1/t)]\, \frac{dt}{t^{\nu+1}}.$$

The contour $C$ is the same as that for $J_\nu(x)$, Fig. 11.8.



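11.5.4 For $\nu > -\tfrac{1}{2}$ show that $I_\nu(z)$ may be represented by

$$I_\nu(z) = \frac{1}{\pi^{1/2} (\nu - \tfrac{1}{2})!} \left(\frac{z}{2}\right)^{\nu} \int_0^{\pi} e^{\pm z\cos\theta} \sin^{2\nu}\theta\, d\theta
= \frac{2}{\pi^{1/2} (\nu - \tfrac{1}{2})!} \left(\frac{z}{2}\right)^{\nu} \int_0^{\pi/2} \cosh(z\cos\theta) \sin^{2\nu}\theta\, d\theta.$$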


11.5.5 A cylindrical cavity has a radius $a$ and a height $l$, Fig. 11.4. The ends, $z = 0$ and
$l$, are at zero potential. The cylindrical walls, $\rho = a$, have a potential $V = V(\varphi, z)$.
(a) Show that the electrostatic potential $\Phi(\rho, \varphi, z)$ has the functional form

$$\Phi(\rho, \varphi, z) = \sum_{m=0}^{\infty} \sum_{n=1}^{\infty} I_m(k_n\rho) \sin k_n z \cdot (a_{mn} \sin m\varphi + b_{mn} \cos m\varphi),$$

where $k_n = n\pi/l$.

(b) Show that the coefficients $a_{mn}$ and $b_{mn}$ are given by⁵

$$\left.\begin{matrix} a_{mn} \\ b_{mn} \end{matrix}\right\} = \frac{2}{\pi l\, I_m(k_n a)} \int_0^{2\pi}\!\! \int_0^{l} V(\varphi, z) \sin k_n z \left\{\begin{matrix} \sin m\varphi \\ \cos m\varphi \end{matrix}\right\} dz\, d\varphi.$$

Hint. Expand $V(\varphi, z)$ as a double series and use the orthogonality of the
trigonometric functions.


11.5.6 Verify that $K_\nu(x)$ is given by

$$K_\nu(x) = \frac{\pi}{2}\, \frac{I_{-\nu}(x) - I_\nu(x)}{\sin \nu\pi}$$

and from this show that

$$K_\nu(x) = K_{-\nu}(x).$$

11.5.7 Show that $K_\nu(x)$ satisfies the recurrence relations

$$K_{\nu-1}(x) - K_{\nu+1}(x) = -\frac{2\nu}{x} K_\nu(x),$$

$$K_{\nu-1}(x) + K_{\nu+1}(x) = -2 K_\nu'(x).$$

11.5.8 If $\mathcal{K}_\nu = e^{\nu\pi i} K_\nu$, show that $\mathcal{K}_\nu$ satisfies the same recurrence relations as $I_\nu$.


5 When $m = 0$, the 2 in the coefficient is replaced by 1.

11.5.9 For $\nu > -\tfrac{1}{2}$ show that $K_\nu(z)$ may be represented by

$$K_\nu(z) = \frac{\pi^{1/2}}{(\nu - \tfrac{1}{2})!} \left(\frac{z}{2}\right)^{\nu} \int_0^{\infty} e^{-z\cosh t} \sinh^{2\nu} t\, dt, \qquad -\frac{\pi}{2} < \arg z < \frac{\pi}{2},$$

$$\phantom{K_\nu(z)} = \frac{\pi^{1/2}}{(\nu - \tfrac{1}{2})!} \left(\frac{z}{2}\right)^{\nu} \int_1^{\infty} e^{-zp} (p^2 - 1)^{\nu - 1/2}\, dp.$$

11.5.10 Show that $I_\nu(x)$ and $K_\nu(x)$ satisfy the Wronskian relation

$$I_\nu(x) K_\nu'(x) - I_\nu'(x) K_\nu(x) = -\frac{1}{x}.$$

This result is quoted in Section 16.6 in the development of a Green's function.


11.5.11 If $r = (x^2 + y^2)^{1/2}$, prove that

$$\frac{1}{r} = \frac{2}{\pi} \int_0^{\infty} \cos(xt)\, K_0(yt)\, dt.$$

This is a Fourier cosine transform of $K_0$.


11.5.12 (a) Verify that

$$I_0(x) = \frac{1}{\pi} \int_0^{\pi} \cosh(x\cos\theta)\, d\theta$$

satisfies the modified Bessel equation, $\nu = 0$.

(b) Show that this integral contains no admixture of $K_0(x)$, the irregular
second solution.

(c) Verify the normalization factor $1/\pi$.


11.5.13 Verify that the integral representations

$$I_n(z) = \frac{1}{\pi} \int_0^{\pi} e^{z\cos t} \cos(nt)\, dt,$$

$$K_\nu(z) = \int_0^{\infty} e^{-z\cosh t} \cosh(\nu t)\, dt, \qquad \Re(z) > 0,$$

satisfy the modified Bessel equation by direct substitution into that equation.
How can you show that the first form does not contain an admixture of $K_n$,
that the second form does not contain an admixture of $I_\nu$? How can you check
the normalization?


11.5.14 Derive the integral representation

$$I_n(x) = \frac{1}{\pi} \int_0^{\pi} e^{x\cos\theta} \cos(n\theta)\, d\theta.$$

Hint. Start with the corresponding integral representation of $J_n(x)$. Equation
11.120 is a special case of this representation.

11.5.15 Show that

$$K_0(z) = \int_0^{\infty} e^{-z\cosh t}\, dt$$

satisfies the modified Bessel equation. How can you establish that this form is
linearly independent of $I_0(z)$?

11.5.16 Show that

$$e^{ax} = I_0(a) T_0(x) + 2 \sum_{n=1}^{\infty} I_n(a) T_n(x), \qquad -1 < x < 1.$$

$T_n(x)$ is the $n$th-order Chebyshev polynomial, Sections 13.3 and 13.4.

Hint. Assume a Chebyshev series expansion. Using the orthogonality and
normalization of the $T_n(x)$, solve for the coefficients of the Chebyshev series.

11.5.17 (a) Write a double precision subroutine to calculate $I_n(x)$ to 12-decimal place
accuracy for $n = 0, 1, 2, 3, \ldots$ and $0 < x < 1$. Check your results against
the 10 place values given in AMS-55, Table 9.11.

(b) Referring to Exercise 11.5.16, calculate the coefficients in the Chebyshev
expansions of $\cosh x$ and of $\sinh x$.

Note. An alternate calculation of these coefficients is one of the topics of Section
13.4.

11.5.18 The cylindrical cavity of Exercise 11.5.5 has a potential along the cylinder walls

$$V(z) = \begin{cases} 100\, z/l, & 0 \le z/l \le 1/2, \\ 100\,(1 - z/l), & 1/2 \le z/l \le 1. \end{cases}$$

With the radius-height ratio $a/l = 0.5$, calculate the potential for $z/l = 0.1(0.1)0.5$
and $\rho/a = 0.0(0.2)1.0$.

Check value. For $z/l = 0.3$ and $\rho/a = 0.8$, $V = 26.396$.

11.6 ASYMPTOTIC EXPANSIONS

Frequently in physical problems there is a need to know how a given Bessel 
or modified Bessel function behaves for large values of the argument, that is, 
the asymptotic behavior. This is one occasion when computers are not very 
helpful. One possible approach is to develop a power-series solution of the 
differential equation, as in Section 8.5, but now using negative powers. This is
Stokes's method, Exercise 11.6.5. The limitation is that starting from some
positive value of the argument (for convergence of the series), we do not know 
what mixture of solutions or multiple of a given solution we have. The problem 
is to relate the asymptotic series (useful for large values of the variable) to the 
power series or related definition (useful for small values of the variable). This 
relationship can be established by introducing a suitable integral representation 
and then using either the method of steepest descent, Section 7.4, or the direct 
expansion as developed in this section. 

Expansion of an Integral Representation, K v (z) 

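As a direct approach, consider the integral representation (Exercise 11.5.9)

$$K_\nu(z) = \frac{\pi^{1/2}}{(\nu - \tfrac{1}{2})!} \left(\frac{z}{2}\right)^{\nu} \int_1^{\infty} e^{-zx} (x^2 - 1)^{\nu - 1/2}\, dx, \qquad \nu > -\tfrac{1}{2}. \tag{11.122}$$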

For the present let us take $z$ to be real, although Eq. 11.136 may be established
for $-\pi/2 < \arg z < \pi/2$ ($\Re(z) > 0$). We have three problems: (1) to show that
$K_\nu$ as given in Eq. 11.122 actually satisfies the modified Bessel equation (11.108);
(2) to show that the regular solution $I_\nu$ is absent; and (3) to show that Eq. 11.122
has the proper normalization.

1. The fact that Eq. 11.122 is a solution of the modified
   Bessel equation may be verified by direct substitution.
   We obtain

   $$z^{\nu+1} \int_1^{\infty} \frac{d}{dx}\left[e^{-zx} (x^2 - 1)^{\nu + 1/2}\right] dx = 0,$$

   which transforms the combined integrand into the
   derivative of a function that vanishes at both end
   points. Hence the integral is some linear combination
   of $I_\nu$ and $K_\nu$.

2. The rejection of the possibility that this solution
   contains $I_\nu$ constitutes Exercise 11.6.1.

3. The normalization may be verified by substituting
   $x = 1 + t/z$.

   $$\frac{\pi^{1/2}}{(\nu - \tfrac{1}{2})!} \left(\frac{z}{2}\right)^{\nu} \int_1^{\infty} e^{-zx} (x^2 - 1)^{\nu - 1/2}\, dx \tag{11.123a}$$

   $$= \frac{\pi^{1/2}}{(\nu - \tfrac{1}{2})!\, 2^{\nu} z^{\nu}}\, e^{-z} \int_0^{\infty} e^{-t}\, t^{2\nu - 1} \left(1 + \frac{2z}{t}\right)^{\nu - 1/2} dt, \tag{11.123b}$$

   taking out $t^2/z^2$ as a factor. This substitution has
   changed the limits of integration to a more convenient
   range and has isolated the negative exponential
   dependence, $e^{-z}$. The integral in Eq. 11.123b may be
   evaluated for $z = 0$ to yield $(2\nu - 1)!$. Then, using
   the duplication formula (Section 10.4), we have

   $$\lim_{z \to 0} K_\nu(z) = \frac{(\nu - 1)!\, 2^{\nu - 1}}{z^{\nu}}, \qquad \nu > 0, \tag{11.124}$$

   in agreement with Eq. 11.119, which thus checks
   the normalization.¹


Now to develop an asymptotic series for $K_\nu(z)$, we may rewrite 11.123a as

$$K_\nu(z) = \sqrt{\frac{\pi}{2z}}\, \frac{e^{-z}}{(\nu - \tfrac{1}{2})!} \int_0^{\infty} e^{-t}\, t^{\nu - 1/2} \left(1 + \frac{t}{2z}\right)^{\nu - 1/2} dt \tag{11.125}$$

(taking out $2t/z$ as a factor).

We expand $(1 + t/2z)^{\nu - 1/2}$ by the binomial theorem to obtain

$$K_\nu(z) = \sqrt{\frac{\pi}{2z}}\, \frac{e^{-z}}{(\nu - \tfrac{1}{2})!} \sum_{r=0}^{\infty} \frac{(\nu - \tfrac{1}{2})!}{r!\,(\nu - r - \tfrac{1}{2})!}\, (2z)^{-r} \int_0^{\infty} e^{-t}\, t^{\nu + r - 1/2}\, dt. \tag{11.126}$$

1 For $\nu = 0$ the integral diverges logarithmically in agreement with the
logarithmic divergence of $K_0(z)$ (Section 11.5).

Term-by-term integration (valid for asymptotic series) yields the desired
asymptotic expansion of $K_\nu(z)$.


$$K_\nu(z) \sim \sqrt{\frac{\pi}{2z}}\, e^{-z} \left[1 + \frac{(4\nu^2 - 1^2)}{1!\,8z} + \frac{(4\nu^2 - 1^2)(4\nu^2 - 3^2)}{2!\,(8z)^2} + \cdots\right]. \tag{11.127}$$


Although the integral of Eq. 11.122, integrating along the real axis, was
convergent only for $-\pi/2 < \arg z < \pi/2$, Eq. 11.127 may be extended to $-3\pi/2 <
\arg z < 3\pi/2$. Considered as an infinite series, Eq. 11.127 is actually divergent.²
However, this series is asymptotic in the sense that for large enough $z$, $K_\nu(z)$
may be approximated to any fixed degree of accuracy. (Compare Section 5.10
for a definition and discussion of asymptotic series.)
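The asymptotic (rather than convergent) character of Eq. 11.127 shows up clearly in a small numerical experiment. The Python sketch below generates the bracketed terms of Eq. 11.127 by the ratio of successive terms and stops adding as soon as the terms stop decreasing in magnitude. That stopping rule is a common rule of thumb adopted here as an assumption, not a prescription from the text, and the function names are illustrative.

```python
import math


def k_asymptotic_terms(nu, z, nmax=40):
    """Terms 1, (mu-1)/(1! 8z), (mu-1)(mu-9)/(2! (8z)^2), ... with mu = 4 nu^2."""
    mu = 4.0 * nu * nu
    terms = [1.0]
    for r in range(1, nmax + 1):
        terms.append(terms[-1] * (mu - (2 * r - 1) ** 2) / (r * 8.0 * z))
    return terms


def k_asymptotic(nu, z, nmax=40):
    """Sum the asymptotic series for K_nu(z), truncating at the smallest term."""
    total = 0.0
    previous = float("inf")
    for term in k_asymptotic_terms(nu, z, nmax):
        if abs(term) >= previous:
            break  # terms have started to grow (or vanished): stop here
        total += term
        previous = abs(term)
    return math.sqrt(math.pi / (2.0 * z)) * math.exp(-z) * total


if __name__ == "__main__":
    # For z = 10 the terms shrink for about 20 steps and then grow again.
    magnitudes = [abs(t) for t in k_asymptotic_terms(0.0, 10.0, nmax=30)]
    print("index of smallest term:", magnitudes.index(min(magnitudes)))
    print("K_0(10) ~", k_asymptotic(0.0, 10.0))
```

For half-integral $\nu$ the numerator factor vanishes at some stage and the sum terminates, so the truncated series is then exact.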

It is convenient to rewrite Eq. 11.127 as

$$K_\nu(z) = \sqrt{\frac{\pi}{2z}}\, e^{-z} \left[P_\nu(iz) + i Q_\nu(iz)\right], \tag{11.128}$$

where

$$P_\nu(z) \sim 1 - \frac{(\mu - 1)(\mu - 9)}{2!\,(8z)^2} + \frac{(\mu - 1)(\mu - 9)(\mu - 25)(\mu - 49)}{4!\,(8z)^4} - \cdots, \tag{11.129a}$$

$$Q_\nu(z) \sim \frac{\mu - 1}{1!\,(8z)} - \frac{(\mu - 1)(\mu - 9)(\mu - 25)}{3!\,(8z)^3} + \cdots, \tag{11.129b}$$

and

$$\mu = 4\nu^2.$$

It should be noted that although $P_\nu(z)$ of Eq. 11.129a and $Q_\nu(z)$ of Eq. 11.129b
have alternating signs, the series for $P_\nu(iz)$ and $iQ_\nu(iz)$ of Eq. 11.128 have all signs
positive. Finally, for $z$ large, $P_\nu$ dominates.

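Then with the asymptotic form of $K_\nu(z)$, Eq. 11.128, we can obtain expansions
for all other Bessel and hyperbolic Bessel functions by defining relations:

1. From

   $$\frac{\pi}{2}\, i^{\nu+1} H_\nu^{(1)}(iz) = K_\nu(z) \tag{11.130}$$

   we have

   $$H_\nu^{(1)}(z) = \sqrt{\frac{2}{\pi z}}\, \exp\left\{i\left[z - \left(\nu + \tfrac{1}{2}\right)\frac{\pi}{2}\right]\right\} \left[P_\nu(z) + i Q_\nu(z)\right], \qquad -\pi < \arg z < 2\pi. \tag{11.131}$$

2. The second Hankel function is just the complex
   conjugate of the first (for real argument),

   $$H_\nu^{(2)}(z) = \sqrt{\frac{2}{\pi z}}\, \exp\left\{-i\left[z - \left(\nu + \tfrac{1}{2}\right)\frac{\pi}{2}\right]\right\} \left[P_\nu(z) - i Q_\nu(z)\right], \qquad -2\pi < \arg z < \pi. \tag{11.132}$$

   An alternate derivation of the asymptotic behavior
   of the Hankel functions appears in Section 7.4 as
   an application of the method of steepest descents.

3. Since $J_\nu(z)$ is the real part of $H_\nu^{(1)}(z)$,

   $$J_\nu(z) = \sqrt{\frac{2}{\pi z}} \left\{P_\nu(z) \cos\left[z - \left(\nu + \tfrac{1}{2}\right)\frac{\pi}{2}\right] - Q_\nu(z) \sin\left[z - \left(\nu + \tfrac{1}{2}\right)\frac{\pi}{2}\right]\right\}, \qquad -\pi < \arg z < \pi. \tag{11.133}$$

4. The Neumann function is the imaginary part of
   $H_\nu^{(1)}(z)$, or

   $$N_\nu(z) = \sqrt{\frac{2}{\pi z}} \left\{P_\nu(z) \sin\left[z - \left(\nu + \tfrac{1}{2}\right)\frac{\pi}{2}\right] + Q_\nu(z) \cos\left[z - \left(\nu + \tfrac{1}{2}\right)\frac{\pi}{2}\right]\right\}, \qquad -\pi < \arg z < \pi. \tag{11.134}$$

5. Finally, the regular hyperbolic or modified Bessel
   function $I_\nu(z)$ is given by

   $$I_\nu(z) = i^{-\nu} J_\nu(iz) \tag{11.135}$$

   or

   $$I_\nu(z) = \frac{e^{z}}{\sqrt{2\pi z}} \left[P_\nu(iz) - i Q_\nu(iz)\right], \qquad -\frac{\pi}{2} < \arg z < \frac{\pi}{2}. \tag{11.136}$$

2 Our binomial expansion is valid only for $t < 2z$ and we have integrated $t$ out
to infinity. The exponential decrease of the integrand prevents a disaster but
the resultant series is still only asymptotic, not convergent. By Table 8.3
$z = \infty$ is an essential singularity of the Bessel (and modified Bessel) equations.
Fuchs's theorem does not guarantee a convergent series and we do not get a
convergent series.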
This completes our determination of the asymptotic expansions. However, it is
perhaps worth noting the primary characteristics. Apart from the ubiquitous
$z^{-1/2}$, $J_\nu$ and $N_\nu$ behave as cosine and sine, respectively. The zeros are almost
evenly spaced at intervals of $\pi$; the spacing becomes exactly $\pi$ in the limit as
$z \to \infty$. The Hankel functions have been defined to behave like the imaginary
exponentials, and the modified Bessel functions, $I_\nu$ and $K_\nu$, go into the positive
and negative exponentials. This asymptotic behavior may be sufficient to
eliminate immediately one of these functions as a solution for a physical
problem. We should also note that the asymptotic series $P_\nu(z)$ and $Q_\nu(z)$, Eqs.
11.129a and b, terminate for $\nu = \pm 1/2, \pm 3/2, \ldots$ and become polynomials
(in negative powers of $z$). For these special values of $\nu$ the asymptotic
approximations become exact solutions.

It is of some interest to consider the accuracy of the asymptotic forms,
taking just the first term, for example (Fig. 11.12),

$$J_n(x) \approx \sqrt{\frac{2}{\pi x}}\, \cos\left[x - \left(n + \tfrac{1}{2}\right)\frac{\pi}{2}\right]. \tag{11.137}$$

Clearly, the condition for the validity of Eq. 11.137 is that the sine term be
negligible; that is

$$8x \gg 4n^2 - 1. \tag{11.138}$$

For $n$ or $\nu > 1$ the asymptotic region may be far out.
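To see how far out the asymptotic region lies, one can compare the leading term with an independent evaluation of $J_n(x)$. The sketch below uses Bessel's integral, $J_n(x) = \frac{1}{\pi}\int_0^\pi \cos(n\theta - x\sin\theta)\,d\theta$ (a standard representation for integer $n$, brought in here as the reference value rather than taken from this section), evaluated by the trapezoidal rule. The function names and step count are assumptions for illustration.

```python
import math


def bessel_j_integral(n, x, steps=2000):
    """J_n(x), integer n, from Bessel's integral via the trapezoidal rule."""
    h = math.pi / steps
    # End points: sin(theta) = 0 there, so the integrand is cos(0) and cos(n pi).
    total = 0.5 * (1.0 + math.cos(n * math.pi))
    for k in range(1, steps):
        theta = k * h
        total += math.cos(n * theta - x * math.sin(theta))
    return total * h / math.pi


def leading_asymptotic(n, x):
    """First term of the large-x form: sqrt(2/(pi x)) cos[x - (n + 1/2) pi/2]."""
    return math.sqrt(2.0 / (math.pi * x)) * math.cos(x - (n + 0.5) * math.pi / 2.0)


if __name__ == "__main__":
    for n, x in ((0, 100.0), (0, 5.0), (5, 5.0), (5, 100.0)):
        exact = bessel_j_integral(n, x)
        approx = leading_asymptotic(n, x)
        print(f"n={n} x={x:6.1f} 8x={8 * x:6.1f} 4n^2-1={4 * n * n - 1:3d} "
              f"J={exact:+.6f} leading={approx:+.6f}")
```

When $8x$ is not large compared with $4n^2 - 1$ (for instance $n = 5$, $x = 5$) the leading term is not even qualitatively right, in line with the condition stated above.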

As pointed out in Section 11.3, the asymptotic forms may be used to evaluate 
the various Wronskian formulas (compare Exercise 11.6.3). 

Numerical Evaluation 

When a program in a large high-speed computing machine calls for one of 
the Bessel or modified Bessel functions, the programmer has two alternatives:
to store all the Bessel functions and tell the computer how to locate the required
value or to instruct the computer to simply calculate the needed value. The
first alternative would be fairly slow and would place unreasonable demands
on the storage capacity. Thus our programmer adopts the "compute it yourself"
alternative.

The computation of $J_n(x)$ using the recurrence relation, Eq. 11.10, is discussed
in Section 11.1. For $N_n$, $I_n$, and $K_n$ the preferred methods are the series if $x$ is
small and the asymptotic forms (with many terms in the series of negative
powers) if $x$ is large. The criteria of large and small may vary as shown in
Table 11.2.

TABLE 11.2 Equations for the Computation of Neumann
and the Modified Bessel Functions

            Power Series                     Asymptotic Series
N_n(x)      Eq. 11.63,  x ≤ 4                Eq. 11.134, x > 4
I_n(x)      Eq. 11.112, x ≤ 12 or x ≤ n      Eq. 11.136, x > 12 and x > n
K_n(x)      Eq. 11.119, x ≤ 1                Eq. 11.127, x > 1

In actual practice, it is found convenient to limit the series (power or asymptotic)
computation of $N_n(x)$ and $K_n(x)$ to $n = 0, 1$. Then $N_n(x)$, $n \ge 2$, is computed using the recurrence
relation, Eq. 11.10. $K_n(x)$, $n \ge 2$, is computed using the recurrence relations of Exercise
11.5.7. $I_n(x)$ could be handled this way, if desired, but direct application of the power series
or asymptotic series is feasible for all values of $n$ and $x$.
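The recurrence strategy for $K_n(x)$ can be sketched in a few lines of Python. For the starting values $K_0$ and $K_1$ this illustration does not use the series of Table 11.2; it substitutes the integral representation of Exercise 11.5.13, $K_\nu(x) = \int_0^\infty e^{-x\cosh t}\cosh(\nu t)\,dt$, evaluated by the trapezoidal rule, purely to keep the example short. The upward step is the first relation of Exercise 11.5.7 rearranged as $K_{n+1} = K_{n-1} + (2n/x)K_n$. Function names, the cutoff `t_max`, and the step count are assumptions.

```python
import math


def k_integral(nu, x, t_max=12.0, steps=4000):
    """K_nu(x), x > 0, from int_0^inf exp(-x cosh t) cosh(nu t) dt.

    Trapezoidal rule on [0, t_max]; the contribution of the upper end point
    is dropped because the integrand is negligible there.
    """
    h = t_max / steps
    total = 0.5 * math.exp(-x)  # t = 0 end point, weight 1/2
    for k in range(1, steps):
        t = k * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h


def k_upward(n, x):
    """K_n(x) for integer n >= 0 by upward recurrence from K_0 and K_1."""
    k_prev, k_curr = k_integral(0, x), k_integral(1, x)
    if n == 0:
        return k_prev
    for m in range(1, n):
        # K_{m+1} = K_{m-1} + (2 m / x) K_m
        k_prev, k_curr = k_curr, k_prev + (2.0 * m / x) * k_curr
    return k_curr


if __name__ == "__main__":
    print(k_upward(2, 1.0), k_upward(3, 2.0))
```

Upward recurrence suits $K_n$ because $K_n$ grows with $n$ at fixed $x$, so rounding errors in the starting values are not amplified relative to the result.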


EXERCISES 


11.6.1 In checking the normalization of the integral representation of $K_\nu(z)$ (Eq. 11.122),
we assumed that $I_\nu(z)$ was not present. How do we know that the integral
representation (Eq. 11.122) does not yield $K_\nu(z) + \varepsilon I_\nu(z)$ with $\varepsilon \neq 0$?

11.6.2 (a) Show that

$$y(z) = z^{\nu} \int e^{-zt} (t^2 - 1)^{\nu - 1/2}\, dt$$

satisfies the modified Bessel equation, provided the contour is chosen so that

$$e^{-zt} (t^2 - 1)^{\nu + 1/2}$$

has the same value at the initial and final points of the contour.

(b) Verify that the contours shown in Fig. 11.13 are suitable for this problem.

FIG. 11.13 Modified Bessel function contours

11.6.3 Use the asymptotic expansions to verify the following Wronskian formulas:

(a) $J_\nu(x) J_{-\nu-1}(x) + J_{-\nu}(x) J_{\nu+1}(x) = -\dfrac{2 \sin \nu\pi}{\pi x}$,

(b) $J_\nu(x) N_{\nu+1}(x) - J_{\nu+1}(x) N_\nu(x) = -\dfrac{2}{\pi x}$,

(c) $J_\nu(x) H_{\nu-1}^{(2)}(x) - J_{\nu-1}(x) H_\nu^{(2)}(x) = \dfrac{2}{i\pi x}$,

(d) $I_\nu(x) K_\nu'(x) - I_\nu'(x) K_\nu(x) = -\dfrac{1}{x}$,

(e) $I_\nu(x) K_{\nu+1}(x) + I_{\nu+1}(x) K_\nu(x) = \dfrac{1}{x}$.

11.6.4 From the asymptotic form of $K_\nu(z)$, Eq. 11.127, derive the asymptotic form of
$H_\nu^{(1)}(z)$, Eq. 11.131. Note particularly the phase, $(\nu + \tfrac{1}{2})\pi/2$.

11.6.5 Stokes’s method. 

(a) Replace the Bessel function in Bessel's equation by $x^{-1/2} y(x)$ and show that
$y(x)$ satisfies

$$y''(x) + \left(1 - \frac{\nu^2 - \tfrac{1}{4}}{x^2}\right) y(x) = 0.$$

(b) Develop a power-series solution with negative powers of $x$ starting with
the assumed form

$$y(x) = e^{ix} \sum_{n=0}^{\infty} a_n x^{-n}.$$

Determine the recurrence relation giving $a_{n+1}$ in terms of $a_n$. Check your
result against the asymptotic series, Eq. 11.131.

(c) From the results of Section 7.4 determine the initial coefficient, $a_0$.

11.6.6 Calculate the first 15 partial sums of $P_0(x)$ and $Q_0(x)$, Eqs. 11.129a and 11.129b.
Let $x$ vary from 4 to 10 in unit steps. Determine the number of terms to be retained
for maximum accuracy and the accuracy achieved as a function of $x$. Specifically,
how small may $x$ be without raising the error above $3 \times 10^{-6}$?

ANS. $x_{\min} = 6$.

11.6.7 (a) Using the asymptotic series (partial sums) $P_0(x)$ and $Q_0(x)$ determined in
Exercise 11.6.6, write a function subprogram FCT(X) that will calculate
$J_0(x)$, $x$ real, for $x \ge x_{\min}$.

(b) Test your function by comparing it with the $J_0(x)$ (tables or computer
library subroutine) for $x = x_{\min}(10)x_{\min} + 10$.

Note. A more accurate and perhaps simpler asymptotic form for $J_0(x)$ is given
in AMS-55, Eq. 9.4.3.


11.7 SPHERICAL BESSEL FUNCTIONS

When the Helmholtz equation is separated in spherical coordinates the
radial equation has the form

$$r^2 \frac{d^2 R}{dr^2} + 2r \frac{dR}{dr} + \left[k^2 r^2 - n(n + 1)\right] R = 0. \tag{11.139}$$

This is Eq. 2.91 of Section 2.6. The parameter $k$ enters from the original
Helmholtz equation while $n(n + 1)$ is a separation constant. From the behavior of
the polar angle function (Legendre's equation, Sections 8.5 and 12.7), the
separation constant must have this form, with $n$ a non-negative integer.
Equation 11.139 has the virtue of being self-adjoint but clearly it is not Bessel's
equation. However, if we substitute

$$R(kr) = \frac{Z(kr)}{(kr)^{1/2}},$$

Equation 11.139 becomes

$$r^2 \frac{d^2 Z}{dr^2} + r \frac{dZ}{dr} + \left[k^2 r^2 - \left(n + \tfrac{1}{2}\right)^2\right] Z = 0, \tag{11.140}$$

which is Bessel's equation. $Z$ is a Bessel function of order $n + \tfrac{1}{2}$ ($n$ an integer).
Because of the importance of spherical coordinates, this combination,

$$\frac{Z_{n+1/2}(kr)}{(kr)^{1/2}},$$

occurs quite often.


Definitions 

It is convenient to label these functions spherical Bessel functions with the
following defining equations

$$\begin{aligned}
j_n(x) &= \sqrt{\frac{\pi}{2x}}\, J_{n+1/2}(x), \\
n_n(x) &= \sqrt{\frac{\pi}{2x}}\, N_{n+1/2}(x) = (-1)^{n+1} \sqrt{\frac{\pi}{2x}}\, J_{-n-1/2}(x),^{1} \\
h_n^{(1)}(x) &= \sqrt{\frac{\pi}{2x}}\, H_{n+1/2}^{(1)}(x) = j_n(x) + i n_n(x), \\
h_n^{(2)}(x) &= \sqrt{\frac{\pi}{2x}}\, H_{n+1/2}^{(2)}(x) = j_n(x) - i n_n(x).
\end{aligned} \tag{11.141}$$

These spherical Bessel functions (Figs. 11.14 and 11.15) can be expressed in
series form by using the series (Eq. 11.5) for $J_n$, replacing $n$ with $n + \tfrac{1}{2}$.

$$J_{n+1/2}(x) = \sum_{s=0}^{\infty} \frac{(-1)^s}{s!\,(s + n + \tfrac{1}{2})!} \left(\frac{x}{2}\right)^{2s + n + 1/2}. \tag{11.142}$$

Using the Legendre duplication formula,

$$z!\,(z + \tfrac{1}{2})! = 2^{-2z-1} \pi^{1/2} (2z + 1)!, \tag{11.143}$$

1 This is possible because $\cos(n + \tfrac{1}{2})\pi = 0$.

FIG. 11.14 Spherical Bessel functions

FIG. 11.15 Spherical Neumann functions

we have

$$j_n(x) = \sqrt{\frac{\pi}{2x}} \sum_{s=0}^{\infty} \frac{(-1)^s\, 2^{2s+2n+1} (s + n)!}{\pi^{1/2} (2s + 2n + 1)!\, s!} \left(\frac{x}{2}\right)^{2s + n + 1/2}
= 2^n x^n \sum_{s=0}^{\infty} \frac{(-1)^s (s + n)!}{s!\,(2s + 2n + 1)!}\, x^{2s}. \tag{11.144}$$

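Now $N_{n+1/2}(x) = (-1)^{n+1} J_{-n-1/2}(x)$ and from Eq. 11.5 we find that

$$J_{-n-1/2}(x) = \sum_{s=0}^{\infty} \frac{(-1)^s}{s!\,(s - n - \tfrac{1}{2})!} \left(\frac{x}{2}\right)^{2s - n - 1/2}. \tag{11.145}$$

This yields

$$n_n(x) = (-1)^{n+1}\, \frac{2^n \pi^{1/2}}{x^{n+1}} \sum_{s=0}^{\infty} \frac{(-1)^s}{s!\,(s - n - \tfrac{1}{2})!} \left(\frac{x}{2}\right)^{2s}. \tag{11.146}$$

The Legendre duplication formula can be used again to give

$$n_n(x) = \frac{(-1)^{n+1}}{2^n x^{n+1}} \sum_{s=0}^{\infty} \frac{(-1)^s (s - n)!}{s!\,(2s - 2n)!}\, x^{2s}. \tag{11.147}$$

These series forms, Eqs. 11.144 and 11.147, are useful in three ways: (1) limiting
values as $x \to 0$, (2) closed form representations for $n = 0$, and, as an extension
of this, (3) an indication that the spherical Bessel functions are closely related
to sine and cosine.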

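For the special case $n = 0$ we find from Eq. 11.144

$$j_0(x) = \sum_{s=0}^{\infty} \frac{(-1)^s}{(2s + 1)!}\, x^{2s} = \frac{\sin x}{x}, \tag{11.148}$$

whereas for $n = 0$ Eq. 11.147 yields

$$n_0(x) = -\frac{\cos x}{x}. \tag{11.149}$$

From the definition of the spherical Hankel functions (Eq. 11.141),

$$\begin{aligned}
h_0^{(1)}(x) &= \frac{1}{x}(\sin x - i \cos x) = -\frac{i}{x}\, e^{ix}, \\
h_0^{(2)}(x) &= \frac{1}{x}(\sin x + i \cos x) = \frac{i}{x}\, e^{-ix}.
\end{aligned} \tag{11.150}$$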


Equations 11.148 and 11.149 suggest expressing the spherical Bessel functions
as combinations of sine and cosine. The appropriate combinations can be
developed from the power-series solution, Eqs. 11.144 and 11.147, but this
approach is awkward. Actually the trigonometric forms are already available
as the asymptotic expansion of Section 11.6. From Eqs. 11.131 and 11.129a

$$h_n^{(1)}(z) = \sqrt{\frac{\pi}{2z}}\, H_{n+1/2}^{(1)}(z)
= (-i)^{n+1}\, \frac{e^{iz}}{z} \left\{P_{n+1/2}(z) + i Q_{n+1/2}(z)\right\}. \tag{11.151}$$

Now $P_{n+1/2}$ and $Q_{n+1/2}$ are polynomials. This means that Eq. 11.151 is
mathematically exact, not simply an asymptotic approximation. We obtain

$$h_n^{(1)}(z) = (-i)^{n+1}\, \frac{e^{iz}}{z} \sum_{s=0}^{n} \frac{i^s}{s!\,(8z)^s}\, \frac{(2n + 2s)!!}{(2n - 2s)!!}
= (-i)^{n+1}\, \frac{e^{iz}}{z} \sum_{s=0}^{n} \frac{i^s}{s!\,(2z)^s}\, \frac{(n + s)!}{(n - s)!}. \tag{11.152}$$

Often a factor $(-i)^n = (e^{-i\pi/2})^n$ will be combined with the $e^{iz}$ to give $e^{i(z - n\pi/2)}$.
For $z$ real $j_n(z)$ is the real part of this, $n_n(z)$ the imaginary part, and $h_n^{(2)}(z)$ the
complex conjugate.

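Specifically,

$$h_1^{(1)}(x) = e^{ix} \left(-\frac{1}{x} - \frac{i}{x^2}\right), \tag{11.153a}$$

$$h_2^{(1)}(x) = e^{ix} \left(\frac{i}{x} - \frac{3}{x^2} - \frac{3i}{x^3}\right), \tag{11.153b}$$

$$\begin{aligned}
j_0(x) &= \frac{\sin x}{x}, \\
j_1(x) &= \frac{\sin x}{x^2} - \frac{\cos x}{x}, \\
j_2(x) &= \left(\frac{3}{x^3} - \frac{1}{x}\right) \sin x - \frac{3}{x^2} \cos x,
\end{aligned} \tag{11.154}$$

$$\begin{aligned}
n_0(x) &= -\frac{\cos x}{x}, \\
n_1(x) &= -\frac{\cos x}{x^2} - \frac{\sin x}{x}, \\
n_2(x) &= -\left(\frac{3}{x^3} - \frac{1}{x}\right) \cos x - \frac{3}{x^2} \sin x,
\end{aligned} \tag{11.155}$$

and so on.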


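Limiting Values

For $x \ll 1$,² Eqs. 11.144 and 11.147 yield

$$j_n(x) \approx \frac{2^n n!}{(2n+1)!}\,x^n = \frac{x^n}{(2n+1)!!}, \tag{11.156}$$

$$n_n(x) \approx \frac{(-1)^{n+1}}{2^n}\cdot\frac{(-n)!}{(-2n)!}\,x^{-n-1} = -\frac{(2n)!}{2^n n!}\,x^{-n-1} = -(2n-1)!!\,x^{-n-1}. \tag{11.157}$$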


2 The condition that the second term in the series be negligible compared to
the first is actually $x \ll 2[(2n+2)(2n+3)/(n+1)]^{1/2}$ for $j_n(x)$.

The transformation of factorials in the expressions for $n_n(x)$ employs Exercise
10.1.3. The limiting values of the spherical Hankel functions go as $\pm i\,n_n(x)$.

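The asymptotic values of $j_n$, $n_n$, $h_n^{(2)}$, and $h_n^{(1)}$ may be obtained from the Bessel
asymptotic forms, Section 11.6. We find

$$j_n(x) \sim \frac{1}{x}\sin\left(x - \frac{n\pi}{2}\right), \tag{11.158}$$

$$n_n(x) \sim -\frac{1}{x}\cos\left(x - \frac{n\pi}{2}\right), \tag{11.159}$$

$$h_n^{(1)}(x) \sim (-i)^{n+1}\,\frac{e^{ix}}{x} = -i\,\frac{e^{i(x - n\pi/2)}}{x}, \tag{11.160a}$$

$$h_n^{(2)}(x) \sim i^{n+1}\,\frac{e^{-ix}}{x} = i\,\frac{e^{-i(x - n\pi/2)}}{x}. \tag{11.160b}$$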


The condition for these spherical Bessel forms is that $x \gg n(n+1)/2$. From
these asymptotic values we see that $j_n(x)$ and $n_n(x)$ are appropriate for a
description of standing spherical waves; $h_n^{(1)}(x)$ and $h_n^{(2)}(x)$ correspond to traveling
spherical waves. If the time dependence for the traveling waves is taken to be
$e^{-i\omega t}$, then $h_n^{(1)}(x)$ yields an outgoing traveling spherical wave, $h_n^{(2)}(x)$ an incoming
wave. Radiation theory in electromagnetism and scattering theory in quantum
mechanics provide many applications.


Recurrence Relations 

The recurrence relations to which we now turn provide a convenient way 
of developing the higher-order spherical Bessel functions. These recurrence 
relations may be derived from the series, but as with the modified Bessel func- 
tions, it is easier to substitute into the known recurrence relations (Eqs. 11.10 
and 11.12). This gives 

$$f_{n-1}(x) + f_{n+1}(x) = \frac{2n+1}{x}\,f_n(x), \tag{11.161}$$

$$n f_{n-1}(x) - (n+1) f_{n+1}(x) = (2n+1)\,f_n'(x). \tag{11.162}$$

Rearranging these relations (or substituting into Eqs. 11.15 and 11.17), we 
obtain 


$$\frac{d}{dx}\left[x^{n+1} f_n(x)\right] = x^{n+1} f_{n-1}(x), \tag{11.163}$$

$$\frac{d}{dx}\left[x^{-n} f_n(x)\right] = -x^{-n} f_{n+1}(x). \tag{11.164}$$

Here $f_n$ may represent $j_n$, $n_n$, $h_n^{(1)}$, or $h_n^{(2)}$.

The specific forms, Eqs. 11.154 and 11.155, may also be readily obtained from 
Eq. 11.164. 

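By mathematical induction we may establish the Rayleigh formulas

$$j_n(x) = (-1)^n x^n \left(\frac{1}{x}\frac{d}{dx}\right)^n \left(\frac{\sin x}{x}\right), \tag{11.165}$$

$$n_n(x) = -(-1)^n x^n \left(\frac{1}{x}\frac{d}{dx}\right)^n \left(\frac{\cos x}{x}\right), \tag{11.166}$$

$$h_n^{(1)}(x) = -i(-1)^n x^n \left(\frac{1}{x}\frac{d}{dx}\right)^n \left(\frac{e^{ix}}{x}\right),$$

$$h_n^{(2)}(x) = i(-1)^n x^n \left(\frac{1}{x}\frac{d}{dx}\right)^n \left(\frac{e^{-ix}}{x}\right). \tag{11.167}$$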


Numerical Computation 

The spherical Bessel and modified Bessel functions may be computed using
the same techniques described in Sections 11.1 and 11.6 for evaluating the
Bessel functions. For $j_n(x)$ and $i_n(x)$³ it is convenient to use Eq. 11.161 and
Exercise 11.7.18 and work downward, as is done for $J_n(x)$. Normalization is
accomplished by comparing with the known forms of $j_0(x)$ and $i_0(x)$, Eq. 11.15
and Exercise 11.7.15. For $n_n(x)$ and $k_n(x)$, Eq. 11.161 and Exercise 11.7.19 are
used again, but this time working upward, starting with the known forms of
$n_0(x)$, $n_1(x)$, $k_0(x)$, and $k_1(x)$, Eq. 11.155 and Exercise 11.7.17.
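
The two directions of recurrence described above can be sketched in a few lines of Python. This is a minimal illustration, not a library routine: the function names, the starting offset `extra`, and the tiny seed value are assumptions made here, and the argument is taken to be a real $x > 0$. The downward pass is normalized against $j_0(x) = \sin x / x$, so it is not suitable when $x$ sits at (or very near) a zero of $j_0$.

```python
import math

def spherical_neumann(n, x):
    """n_n(x) by upward recurrence (Eq. 11.161), starting from n_0 and n_1."""
    n0 = -math.cos(x) / x
    if n == 0:
        return n0
    n1 = -math.cos(x) / x**2 - math.sin(x) / x
    prev, curr = n0, n1
    for k in range(1, n):
        # f_{k+1} = (2k+1)/x * f_k - f_{k-1}
        prev, curr = curr, (2 * k + 1) / x * curr - prev
    return curr

def spherical_bessel_j(n, x, extra=20):
    """j_n(x) by downward recurrence, normalized with j_0(x) = sin(x)/x.

    `extra` (an assumed safety margin) sets how far above n the
    recurrence is started; the seed values are arbitrary.
    """
    start = n + extra + int(abs(x))
    f_next, f_curr = 0.0, 1.0e-30      # trial values for f_{start+1}, f_{start}
    wanted = f_curr if start == n else None
    for k in range(start, 0, -1):
        # f_{k-1} = (2k+1)/x * f_k - f_{k+1}
        f_prev = (2 * k + 1) / x * f_curr - f_next
        f_next, f_curr = f_curr, f_prev
        if k - 1 == n:
            wanted = f_curr
    # f_curr now holds the unnormalized f_0; rescale so that f_0 = j_0(x).
    return wanted * (math.sin(x) / x) / f_curr
```

Both results can be compared with the closed forms of Eqs. 11.154 and 11.155 for small $n$, which is the kind of independent check suggested in Exercise 11.7.24.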


Orthogonality 

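We may take the orthogonality integral for the ordinary Bessel functions
(Eq. 11.50),

$$\int_0^a J_\nu\!\left(\alpha_{\nu p}\frac{\rho}{a}\right) J_\nu\!\left(\alpha_{\nu q}\frac{\rho}{a}\right)\rho\,d\rho = \frac{a^2}{2}\left[J_{\nu+1}(\alpha_{\nu p})\right]^2\delta_{pq} \tag{11.168}$$

and substitute in the expression for $j_n$ to obtain

$$\int_0^a j_n\!\left(\alpha_{np}\frac{\rho}{a}\right) j_n\!\left(\alpha_{nq}\frac{\rho}{a}\right)\rho^2\,d\rho = \frac{a^3}{2}\left[j_{n+1}(\alpha_{np})\right]^2\delta_{pq}. \tag{11.169}$$

Here $\alpha_{np}$ and $\alpha_{nq}$ are roots of $j_n$.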

This represents orthogonality with respect to the roots of the Bessel func- 
tions. An illustration of this sort of orthogonality is provided later in this 
section by the problem of a particle in a sphere. Equation 11.170 guarantees 
orthogonality of the wave functions j n (r) for fixed n . (If n varies, the spherical 
harmonic will provide orthogonality.) 


EXAMPLE 11.7.1. Particle in a Sphere 

An illustration of the use of the spherical Bessel functions is provided by the 
problem of a quantum mechanical particle in a sphere of radius a. Quantum 
theory requires that the wave function $\psi$, describing our particle, satisfy


3 The spherical modified Bessel functions, $i_n(x)$ and $k_n(x)$, are defined in
Exercise 11.7.15.

$$-\frac{\hbar^2}{2m}\nabla^2\psi = E\psi, \tag{11.170}$$


and the boundary conditions (1) $\psi(r < a)$ remains finite, (2) $\psi(a) = 0$. This
corresponds to a potential $V = 0$, $r < a$, and $V = \infty$, $r > a$. Here $\hbar$ is Planck's
constant (divided by $2\pi$), $m$, the mass of our particle, and $E$, its energy. Let us
determine the minimum value of the energy for which our wave equation has an
acceptable solution. Equation 11.170 is just Helmholtz's equation with a radial
part (compare Section 2.6 for separation of variables):


$$\frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr} + \left[k^2 - \frac{n(n+1)}{r^2}\right]R = 0, \tag{11.171}$$

with $k^2 = 2mE/\hbar^2$. Hence by Eq. 11.139, with $n = 0$,

$$R = A\,j_0(kr) + B\,n_0(kr).$$

We choose the index n = 0, for any angular dependence would raise the energy. 
The spherical Neumann function is rejected because of its divergent behavior at 
the origin. Technically, the spherical Neumann function n 0 is a Green’s function 
satisfying Green's equation and not satisfying the Schrödinger wave equation
at the origin. To satisfy the second boundary condition (for all angles), we require

$$ka = \frac{\sqrt{2mE}}{\hbar}\,a = \alpha,$$


where $\alpha$ is a root of $j_0$, that is, $j_0(\alpha) = 0$. This has the effect of limiting the
allowable energies to a certain discrete set or, in other words, application of boundary
condition (2) quantizes the energy $E$. The smallest $\alpha$ is the first zero of $j_0$,

$$\alpha = \pi$$

and

$$E_{\min} = \frac{\pi^2\hbar^2}{2ma^2} = \frac{h^2}{8ma^2}, \tag{11.173}$$


which means that for any finite sphere the particle will have a positive minimum 
or zero-point energy. This is an illustration of the Heisenberg uncertainty 
principle. 
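
The quantization condition $j_l(\alpha) = 0$ is easy to explore numerically. The following is a hedged sketch, assuming the closed forms of $j_0$, $j_1$, $j_2$ given earlier in this section and a plain bisection search; the helper name `bisect_root` and the bracketing intervals are choices made here for illustration. The roots it finds can be compared with $\alpha = \pi$ above and with the check values quoted in Exercise 11.7.25; in units of $\hbar^2/2ma^2$ the corresponding energy is simply $\alpha^2$.

```python
import math

def j0(x):
    return math.sin(x) / x

def j1(x):
    return math.sin(x) / x**2 - math.cos(x) / x

def j2(x):
    return (3 / x**3 - 1 / x) * math.sin(x) - 3 * math.cos(x) / x**2

def bisect_root(f, lo, hi, tol=1e-12):
    """Locate a root of f in [lo, hi] by bisection; f must change sign there."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)

alpha_01 = bisect_root(j0, 2.0, 4.0)   # first zero of j_0
alpha_11 = bisect_root(j1, 4.0, 5.0)   # first zero of j_1
alpha_21 = bisect_root(j2, 5.0, 6.5)   # first zero of j_2

# Energies in units of hbar^2 / (2 m a^2): E = alpha^2.
e_min = alpha_01**2
```

Bisection is used only because it needs nothing beyond a sign change; any root finder that accepts a bracket would serve equally well.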

In solid state physics, astrophysics, and other areas of physics we may wish 
to know how many different solutions (energy states) correspond to energies 
less than or equal to some fixed energy E 0 . For a cubic volume (Exercise 2.6.5) 
the problem is fairly simple. The considerably more difficult spherical case is 
worked out by R. H. Lambert, Am. J. Phys. 36, 417, 1169 (1968). 

Another form, orthogonality with respect to the indices, may be written as

$$\int_{-\infty}^{\infty} j_m(x)\,j_n(x)\,dx = 0, \qquad m \ne n, \quad m, n \ge 0. \tag{11.174}$$

The proof is left as Exercise 11.7.10. If $m = n$ (compare Exercise 11.7.11), we have

$$\int_{-\infty}^{\infty} \left[j_n(x)\right]^2 dx = \frac{\pi}{2n+1}. \tag{11.175}$$


Most physical applications of orthogonal Bessel and spherical Bessel func- 
tions involve orthogonality with varying roots and an interval [0, a], Eqs. 
11.168 and 11.169. Orthogonality with varying index, Eq. 11.174, is mainly a 
mathematical curiosity. 

The spherical Bessel functions will enter again in connection with spherical 
waves, but further consideration is postponed until the corresponding angular 
functions, the Legendre functions, have been introduced. 


EXERCISES 

11.7.1 Show that if

$$n_n(x) = \sqrt{\frac{\pi}{2x}}\,N_{n+1/2}(x),$$

it automatically equals

$$(-1)^{n+1}\sqrt{\frac{\pi}{2x}}\,J_{-n-1/2}(x).$$


11.7.2 Derive the trigonometric-polynomial forms of $j_n(z)$ and $n_n(z)$.⁴

(a)
$$j_n(z) = \frac{1}{z}\sin\left(z - \frac{n\pi}{2}\right)\sum_{s=0}^{[n/2]}\frac{(-1)^s (n+2s)!}{(2s)!\,(2z)^{2s}\,(n-2s)!}
+ \frac{1}{z}\cos\left(z - \frac{n\pi}{2}\right)\sum_{s=0}^{[(n-1)/2]}\frac{(-1)^s (n+2s+1)!}{(2s+1)!\,(2z)^{2s+1}\,(n-2s-1)!}.$$

(b)
$$n_n(z) = \frac{(-1)^{n+1}}{z}\cos\left(z + \frac{n\pi}{2}\right)\sum_{s=0}^{[n/2]}\frac{(-1)^s (n+2s)!}{(2s)!\,(2z)^{2s}\,(n-2s)!}
- \frac{(-1)^{n+1}}{z}\sin\left(z + \frac{n\pi}{2}\right)\sum_{s=0}^{[(n-1)/2]}\frac{(-1)^s (n+2s+1)!}{(2s+1)!\,(2z)^{2s+1}\,(n-2s-1)!}.$$

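11.7.3 Use the integral representation of $J_\nu(x)$,

$$J_\nu(x) = \frac{1}{\pi^{1/2}(\nu - \tfrac12)!}\left(\frac{x}{2}\right)^{\nu}\int_{-1}^{1} e^{\pm ixp}\,(1 - p^2)^{\nu - 1/2}\,dp,$$

to show that the spherical Bessel functions $j_n(x)$ are expressible in terms of
trigonometric functions; that is, for example,

$$j_0(x) = \frac{\sin x}{x}, \qquad j_1(x) = \frac{\sin x}{x^2} - \frac{\cos x}{x}.$$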

4 The upper limit on the summation $[n/2]$ means the largest integer that does
not exceed $n/2$.

11.7.4 (a) Derive the recurrence relations

$$f_{n-1}(x) + f_{n+1}(x) = \frac{2n+1}{x}\,f_n(x),$$

$$n f_{n-1}(x) - (n+1) f_{n+1}(x) = (2n+1)\,f_n'(x),$$

satisfied by the spherical Bessel functions, $j_n(x)$, $n_n(x)$, $h_n^{(1)}(x)$, and $h_n^{(2)}(x)$.

(b) Show, from these two recurrence relations, that the spherical Bessel
function $f_n(x)$ satisfies the differential equation

$$x^2 f_n''(x) + 2x f_n'(x) + [x^2 - n(n+1)]\,f_n(x) = 0.$$

11.7.5 Prove by mathematical induction that

$$j_n(x) = (-1)^n x^n \left(\frac{1}{x}\frac{d}{dx}\right)^n\left(\frac{\sin x}{x}\right)$$

for $n$ an arbitrary nonnegative integer.

11.7.6 From the discussion of orthogonality of the spherical Bessel functions, show
that a Wronskian relation for $j_n(x)$ and $n_n(x)$ is

$$j_n(x)\,n_n'(x) - j_n'(x)\,n_n(x) = \frac{1}{x^2}.$$

11.7.7 Verify

$$h_n^{(1)}(x)\,h_n^{(2)\prime}(x) - h_n^{(1)\prime}(x)\,h_n^{(2)}(x) = -\frac{2i}{x^2}.$$

11.7.8 Verify Poisson's integral representation of the spherical Bessel function,

$$j_n(z) = \frac{z^n}{2^{n+1}\,n!}\int_0^{\pi}\cos(z\cos\theta)\,\sin^{2n+1}\theta\,d\theta.$$

11.7.9 Show that

$$\int_0^{\infty} J_\mu(x)\,J_\nu(x)\,\frac{dx}{x} = \frac{2}{\pi}\,\frac{\sin[(\mu - \nu)\pi/2]}{\mu^2 - \nu^2}.$$

11.7.10 Derive Eq. 11.174:

$$\int_{-\infty}^{\infty} j_m(x)\,j_n(x)\,dx = 0, \qquad m \ne n, \quad m, n \ge 0.$$

11.7.11 Derive Eq. 11.175:

$$\int_{-\infty}^{\infty}\left[j_n(x)\right]^2 dx = \frac{\pi}{2n+1}.$$

11.7.12 Set up the orthogonality integral for $j_L(kr)$ in a sphere of radius $R$ with the
boundary condition

$$j_L(kR) = 0.$$

The result is used in classifying electromagnetic radiation according to its
angular momentum.

11.7.13 The Fresnel integrals (Fig. 11.16) occurring in diffraction theory are given by

$$x(t) = \int_0^t \cos(v^2)\,dv, \qquad y(t) = \int_0^t \sin(v^2)\,dv.$$

FIG. 11.16 Fresnel integrals

Show that these integrals may be expanded in series of spherical Bessel functions,

$$x(s) = \frac12\int_0^s j_{-1}(u)\,u^{1/2}\,du = s^{1/2}\sum_{n=0}^{\infty} j_{2n}(s),$$

$$y(s) = \frac12\int_0^s j_0(u)\,u^{1/2}\,du = s^{1/2}\sum_{n=0}^{\infty} j_{2n+1}(s).$$

Hint. To establish the equality of the integral and the sum, you may wish to
work with their derivatives. The spherical Bessel analogs of Eqs. 11.12 and 11.14
are helpful.


11.7.14 A hollow sphere of radius $a$ (Helmholtz resonator) contains standing sound
waves. Find the minimum frequency of oscillation in terms of the radius $a$ and
the velocity of sound $v$. The sound waves satisfy the wave equation

$$\nabla^2\psi = \frac{1}{v^2}\frac{\partial^2\psi}{\partial t^2}$$

and the boundary condition

$$\frac{\partial\psi}{\partial r} = 0, \qquad r = a.$$

This is a Neumann boundary condition. Example 11.7.1 has the same differential
equation but with a Dirichlet boundary condition.

ANS. $\nu_{\min} = 0.3313\,v/a$, $\lambda_{\max} = 3.018a$.

FIG. 11.17 Spherical modified Bessel functions

11.7.15 Defining the spherical modified Bessel functions (Fig. 11.17) by

$$i_n(x) = \sqrt{\frac{\pi}{2x}}\,I_{n+1/2}(x), \qquad k_n(x) = \sqrt{\frac{2}{\pi x}}\,K_{n+1/2}(x),$$

show that

$$i_0(x) = \frac{\sinh x}{x}, \qquad k_0(x) = \frac{e^{-x}}{x}.$$

Note that the numerical factors in the definitions of $i_n$ and $k_n$ are not identical.

11.7.16 (a) Show that the parity of $i_n(x)$ is $(-1)^n$.

(b) Show that $k_n(x)$ has no definite parity.

11.7.17 Show that the spherical modified Bessel functions satisfy the following relations:

(a) $$i_n(x) = i^{-n}\,j_n(ix), \qquad k_n(x) = -i^{n}\,h_n^{(1)}(ix),$$

(b) $$i_{n+1}(x) = x^n\frac{d}{dx}\left(x^{-n} i_n\right), \qquad k_{n+1}(x) = -x^n\frac{d}{dx}\left(x^{-n} k_n\right),$$

(c) $$i_n(x) = x^n\left(\frac{1}{x}\frac{d}{dx}\right)^n\frac{\sinh x}{x}, \qquad k_n(x) = (-1)^n x^n\left(\frac{1}{x}\frac{d}{dx}\right)^n\frac{e^{-x}}{x}.$$

11.7.18 Show that the recurrence relations for $i_n(x)$ and $k_n(x)$ are

(a) $$i_{n-1}(x) - i_{n+1}(x) = \frac{2n+1}{x}\,i_n(x),$$
$$n\,i_{n-1}(x) + (n+1)\,i_{n+1}(x) = (2n+1)\,i_n'(x),$$

(b) $$k_{n-1}(x) - k_{n+1}(x) = -\frac{2n+1}{x}\,k_n(x),$$
$$n\,k_{n-1}(x) + (n+1)\,k_{n+1}(x) = -(2n+1)\,k_n'(x).$$

11.7.19 Derive the limiting values for the spherical modified Bessel functions

(a) $$i_n(x) \approx \frac{x^n}{(2n+1)!!}, \qquad k_n(x) \approx \frac{(2n-1)!!}{x^{n+1}}, \qquad x \ll 1.$$

(b) $$i_n(x) \sim \frac{e^{x}}{2x}, \qquad k_n(x) \sim \frac{e^{-x}}{x}, \qquad x \gg n(n+1)/2.$$

11.7.20 Show that the Wronskian of the spherical modified Bessel functions is given by

$$i_n(x)\,k_n'(x) - i_n'(x)\,k_n(x) = -\frac{1}{x^2}.$$

11.7.21 A quantum particle is trapped in a "square" well of radius $a$. The Schrödinger
equation potential is

$$V(r) = \begin{cases} -V_0, & 0 \le r < a \\ 0, & r > a. \end{cases}$$

The particle's energy $E$ is negative (an eigenvalue).

(a) Show that the radial part of the wave function is given by $j_l(k_1 r)$ for $0 \le r < a$
and $k_l(k_2 r)$ for $r > a$. (We require that $\psi(0)$ and $\psi(\infty)$ be finite.) Here
$k_1^2 = 2M(E + V_0)/\hbar^2$, $k_2^2 = -2ME/\hbar^2$, and $l$ is the angular momentum ($n$
in Eq. 11.139).

(b) The boundary condition at $r = a$ is that the wave function $\psi(r)$ and its
first derivative be continuous. Show that this means

$$\left.\frac{(d/dr)\,j_l(k_1 r)}{j_l(k_1 r)}\right|_{r=a} = \left.\frac{(d/dr)\,k_l(k_2 r)}{k_l(k_2 r)}\right|_{r=a}.$$

This equation determines the energy eigenvalues.

Note. This is a generalization of Example 9.1.2.

11.7.22 The quantum mechanical radial wave function for a scattered wave is given by

$$\psi_k = \frac{\sin(kr + \delta_0)}{kr},$$

where $k$ is the wave number, $k = \sqrt{2mE}/\hbar$, and $\delta_0$ is the scattering phase shift.
Show that the normalization integral is 

| dr = ~S{k - k '). 

Hint. You can use a sine representation of the Dirac delta function. See Exercise 
15.3.8. 


11.7.23 Derive the spherical Bessel function closure relation

$$\frac{2a^2}{\pi}\int_0^{\infty} j_n(ar)\,j_n(br)\,r^2\,dr = \delta(a - b).$$


Note. An interesting derivation involving Fourier transforms, the Rayleigh 
plane wave expansion, and spherical harmonics has been given by P. Ugincius, 
Am. J. Phys ., 40, 1690 (1972). 


11.7.24 (a) Write a subroutine that will generate the spherical Bessel functions, $j_n(x)$,
that is, will generate the numerical value of $j_n(x)$ given $x$ and $n$.

Note. One possibility is to use the explicit known forms of $j_0$ and $j_1$ and
to develop the higher index $j_n$ by repeated application of the recurrence
relation.

(b) Check your subroutine by an independent calculation such as Eq. 11.153.
If possible, compare the machine time needed for this check with the time
required for your subroutine.


11.7.25 The wave function of a particle in a sphere (Example 11.7.1) with angular
momentum $l$ is $\psi(r, \theta, \phi) = A\,j_l\!\left(\frac{\sqrt{2ME}}{\hbar}\,r\right)Y_l^m(\theta, \phi)$. The $Y_l^m(\theta, \phi)$ is a spherical
harmonic, described in Section 12.6. From the boundary condition $\psi(a, \theta, \phi) = 0$
or $j_l\!\left(\frac{\sqrt{2ME}}{\hbar}\,a\right) = 0$ calculate the 10 lowest energy states. Disregard the $m$
degeneracy ($2l + 1$ values of $m$ for each choice of $l$). Check your results against
AMS-55, Table 10.6.

Hint. You can use your spherical Bessel subroutine and a root-finding subroutine.

Check values. $j_l(\alpha_{ls}) = 0$,

$\alpha_{11} = 4.4934$
$\alpha_{21} = 5.7635$
$\alpha_{02} = 6.2832$.


11.7.26 Let Example 11.7.1 be modified so that the potential is a finite $V_0$ outside
($r > a$).

(a) For $E < V_0$ show that

$$\psi_{\text{out}}(r, \theta, \phi) \sim k_l\!\left(\frac{\sqrt{2M(V_0 - E)}}{\hbar}\,r\right).$$

(b) The new boundary conditions to be satisfied at $r = a$ are

$$\psi_{\text{in}}(a, \theta, \phi) = \psi_{\text{out}}(a, \theta, \phi),$$

$$\frac{\partial}{\partial r}\psi_{\text{in}}(a, \theta, \phi) = \frac{\partial}{\partial r}\psi_{\text{out}}(a, \theta, \phi)$$

or

$$\left.\frac{1}{\psi_{\text{in}}}\frac{\partial\psi_{\text{in}}}{\partial r}\right|_{r=a} = \left.\frac{1}{\psi_{\text{out}}}\frac{\partial\psi_{\text{out}}}{\partial r}\right|_{r=a}.$$

For $l = 0$ show that the boundary condition at $r = a$ leads to

$$f(E) = k\left\{\cot ka - \frac{1}{ka}\right\} + k'\left\{1 + \frac{1}{k'a}\right\} = 0,$$

where $k = \sqrt{2ME}/\hbar$ and $k' = \sqrt{2M(V_0 - E)}/\hbar$.

(c) With $a = 1\hbar^2/Me^2$ (Bohr radius) and $V_0 = 4Me^4/2\hbar^2$, compute the possible
bound states, ($0 < E < V_0$).

Hint. Call a root-finding subroutine after you know the approximate
location of the roots of

$$f(E) = 0, \qquad (0 \le E \le V_0).$$

(d) Show that when $a = 1\hbar^2/Me^2$ the minimum value of $V_0$ for which a bound
state exists is $V_0 = 2.4674\,Me^4/2\hbar^2$.


11.7.27 In some nuclear stripping reactions the differential cross section is proportional
to $[j_l(x)]^2$, where $l$ is the angular momentum. The location of the maximum on
the curve of experimental data permits a determination of $l$, if the location of
the (first) maximum of $j_l(x)$ is known. Compute the location of the first maximum
of $j_1(x)$, $j_2(x)$, and $j_3(x)$.

Note. For better accuracy look for the first zero of $j_l'(x)$. Why is this more accurate
than direct location of the maximum?


12 LEGENDRE
FUNCTIONS


12.1 GENERATING FUNCTION

Legendre polynomials may appear in many different mathematical and phys- 
ical situations: (1) They may originate as solutions of the Legendre differential 
equation which we have already encountered in the separation of variables 
(Section 2.6) for Laplace’s equation, Helmholtz’s equation, and similar differ- 
ential equations in spherical polar coordinates. (2) They may enter as a con- 
sequence of a Rodrigues' formula (Section 12.4). (3) They may be constructed 
as a consequence of demanding a complete, orthogonal set of functions over 
the interval [—1,1] (Gram-Schmidt orthogonalization, Section 9.3). (4) In 
quantum mechanics they (really the spherical harmonics, Sections 12.6 and 12.7) 
represent angular momentum eigenfunctions. (5) They may be generated by a 
generating function. We introduce Legendre polynomials here by way of a 
generating function. The development of the various properties and related 
functions is shown schematically in Fig. 12.1. 

Physical Basis — Electrostatics 

As with Bessel functions, it is convenient to introduce the Legendre polynomials
by means of a generating function. However, a direct physical interpretation
is possible. Consider an electric charge $q$ placed on the $z$-axis at $z = a$.
As shown in Fig. 12.2, the electrostatic potential of charge $q$ is

$$\varphi = \frac{1}{4\pi\varepsilon_0}\cdot\frac{q}{r_1} \qquad \text{(SI units)}. \tag{12.1}$$

Our problem is to express the electrostatic potential in terms of the spherical
polar coordinates $r$ and $\theta$ (the coordinate $\phi$ is absent because of symmetry
about the $z$-axis). Using the law of cosines, we obtain

$$\varphi = \frac{q}{4\pi\varepsilon_0}\,(r^2 + a^2 - 2ar\cos\theta)^{-1/2}. \tag{12.2}$$

Legendre Polynomials 

Consider the case of $r > a$ or, more precisely, $r^2 > |a^2 - 2ar\cos\theta|$. The
radical may be expanded by the binomial series to give

$$\varphi = \frac{q}{4\pi\varepsilon_0 r}\sum_{n=0}^{\infty} P_n(\cos\theta)\left(\frac{a}{r}\right)^n, \tag{12.3}$$

a series of powers of $(a/r)$ with the coefficient of the $n$th power denoted by
$P_n(\cos\theta)$. The $P_n$ are the Legendre polynomials (Fig. 12.3) and may be defined
by

$$g(t, x) = (1 - 2xt + t^2)^{-1/2} = \sum_{n=0}^{\infty} P_n(x)\,t^n, \qquad |t| < 1. \tag{12.4}$$

FIG. 12.2 Electrostatic potential. Charge $q$ displaced from origin

FIG. 12.3 Legendre polynomials, $P_2(x)$, $P_3(x)$, $P_4(x)$, and $P_5(x)$

This is equivalent to equating the right-hand sides of Eqs. 12.2 and 12.3 with
$\cos\theta$ replaced by $x$ and $a/r$ replaced by $t$. Equation 12.4 is our generating
function. In the next section it is shown that $|P_n(\cos\theta)| \le 1$, which means that
the series expansion (Eq. 12.4) is convergent for $|t| < 1$.¹ Indeed, the series is
convergent for $|t| = 1$ except for $|x| = 1$.

Actually since Eq. 12.4 defines the Legendre polynomials, P n (x), convergence 
of the series is not necessary. We can still obtain the explicit values of the 
polynomials and develop useful relations between them even when the series 
diverges. However, the property of convergence is convenient in order to be 
able to exploit the properties of power series (Section 5.7). 

In physical applications Eq. 12.4 often appears in the vector form

$$\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} = \frac{1}{r_>}\sum_{n=0}^{\infty}\left(\frac{r_<}{r_>}\right)^n P_n(\cos\theta), \tag{12.4a}$$

where

$$r_> = |\mathbf{r}_1|, \quad r_< = |\mathbf{r}_2| \qquad \text{for } |\mathbf{r}_1| > |\mathbf{r}_2|$$

and

$$r_> = |\mathbf{r}_2|, \quad r_< = |\mathbf{r}_1| \qquad \text{for } |\mathbf{r}_2| > |\mathbf{r}_1|.$$


1 Note that the series in Eq. 12.3 is convergent for $r > a$ even though the
binomial expansion involved is valid only for $r > (a^2 + 2ar)^{1/2}$, $\cos\theta = -1$.

Using the binomial theorem (Section 5.6) and Exercise 10.1.15, we expand
the generating function as follows:

$$(1 - 2xt + t^2)^{-1/2} = \sum_{n=0}^{\infty}\frac{(2n)!}{2^{2n}(n!)^2}\,(2xt - t^2)^n
= 1 + \sum_{n=1}^{\infty}\frac{(2n-1)!!}{(2n)!!}\,(2xt - t^2)^n. \tag{12.5}$$


For the first few Legendre polynomials, say, $P_0$, $P_1$, and $P_2$, we need the
coefficients of $t^0$, $t^1$, and $t^2$. These powers of $t$ appear only in the terms $n = 0$, 1, and
2 and hence we may limit our attention to the first three terms of the infinite
series:

$$\frac{0!}{2^0(0!)^2}(2xt - t^2)^0 + \frac{2!}{2^2(1!)^2}(2xt - t^2)^1 + \frac{4!}{2^4(2!)^2}(2xt - t^2)^2
= 1t^0 + xt^1 + \left(\frac32 x^2 - \frac12\right)t^2 + O(t^3).$$

Then, from Eq. 12.4 (and uniqueness of power series)

$$P_0(x) = 1, \qquad P_1(x) = x, \qquad P_2(x) = \frac32 x^2 - \frac12.$$

We repeat this limited development in a vector framework later in this section. 

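In employing a general treatment, we find that the binomial expansion of the
$(2xt - t^2)^n$ factor yields the double series

$$(1 - 2xt + t^2)^{-1/2} = \sum_{n=0}^{\infty}\frac{(2n)!}{2^{2n}(n!)^2}\,t^n\sum_{k=0}^{n}(-1)^k\frac{n!}{k!\,(n-k)!}\,(2x)^{n-k}t^k
= \sum_{n=0}^{\infty}\sum_{k=0}^{n}(-1)^k\frac{(2n)!}{2^{2n}\,n!\,k!\,(n-k)!}\,(2x)^{n-k}t^{n+k}. \tag{12.6}$$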

From Eq. 5.64 of Section 5.4 (rearranging the order of summation), Eq. 12.6
becomes

$$(1 - 2xt + t^2)^{-1/2} = \sum_{n=0}^{\infty}\sum_{k=0}^{[n/2]}(-1)^k\frac{(2n - 2k)!}{2^{2n-2k}\,k!\,(n-k)!\,(n-2k)!}\,(2x)^{n-2k}\,t^n, \tag{12.7}$$


with the variable $t$ independent of the index $k$.² Now, equating our two power
series (Eqs. 12.4 and 12.7) term by term, we have³

$$P_n(x) = \sum_{k=0}^{[n/2]}(-1)^k\frac{(2n - 2k)!}{2^n\,k!\,(n-k)!\,(n-2k)!}\,x^{n-2k}. \tag{12.8}$$


2 $[n/2] = n/2$ for $n$ even, $(n-1)/2$ for $n$ odd.
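
The explicit series, Eq. 12.8, can be evaluated exactly with rational arithmetic. The short Python sketch below is illustrative only; the function name `legendre_coefficients` and the dictionary layout are choices made here, not notation from the text. It returns the coefficient of each power of $x$ in $P_n(x)$ so that the result can be compared with the low-order polynomials obtained above.

```python
from fractions import Fraction
from math import factorial

def legendre_coefficients(n):
    """Coefficients of P_n(x) from the series, Eq. 12.8.

    Returns a dict mapping the power n - 2k to its exact coefficient.
    """
    coeffs = {}
    for k in range(n // 2 + 1):            # k = 0, 1, ..., [n/2]
        numerator = (-1) ** k * factorial(2 * n - 2 * k)
        denominator = (2 ** n * factorial(k)
                       * factorial(n - k) * factorial(n - 2 * k))
        coeffs[n - 2 * k] = Fraction(numerator, denominator)
    return coeffs

p2 = legendre_coefficients(2)   # expected: (3/2) x^2 - 1/2
p4 = legendre_coefficients(4)
```

Because $P_n(1) = 1$, the coefficients returned for any $n$ should sum to one, which gives a quick consistency check on the series.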


Linear Electric Multipoles 

Returning to the electric charge on the z-axis, we demonstrate the usefulness 
and power of the generating function by adding a charge — q at z = — a, as 
shown in Fig. 12.4. The potential becomes 

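$$\varphi = \frac{q}{4\pi\varepsilon_0}\left(\frac{1}{r_1} - \frac{1}{r_2}\right), \tag{12.9}$$

and by using the law of cosines, we have

$$\varphi = \frac{q}{4\pi\varepsilon_0 r}\left\{\left[1 - 2\left(\frac{a}{r}\right)\cos\theta + \left(\frac{a}{r}\right)^2\right]^{-1/2} - \left[1 + 2\left(\frac{a}{r}\right)\cos\theta + \left(\frac{a}{r}\right)^2\right]^{-1/2}\right\}, \qquad (r > a).$$

Clearly, the second radical is like the first, except that $a$ has been replaced by $-a$.
Then, using Eq. 12.4, we obtain

$$\varphi = \frac{q}{4\pi\varepsilon_0 r}\left[\sum_{n=0}^{\infty}P_n(\cos\theta)\left(\frac{a}{r}\right)^n - \sum_{n=0}^{\infty}P_n(\cos\theta)(-1)^n\left(\frac{a}{r}\right)^n\right]
= \frac{2q}{4\pi\varepsilon_0 r}\left[P_1(\cos\theta)\left(\frac{a}{r}\right) + P_3(\cos\theta)\left(\frac{a}{r}\right)^3 + \cdots\right]. \tag{12.10}$$

The first term (and dominant term for $r \gg a$) is

$$\varphi = \frac{2aq}{4\pi\varepsilon_0}\cdot\frac{P_1(\cos\theta)}{r^2}, \tag{12.11}$$

which is the usual electric dipole potential. Here $2aq$ is the dipole moment
(Fig. 12.4).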

This analysis may be extended by placing additional charges on the $z$-axis so
that the $P_1$ term, as well as the $P_0$ (monopole) term, is canceled. For instance,
charges of $q$ at $z = a$ and $z = -a$, $-2q$ at $z = 0$ give rise to a potential whose
series expansion starts with $P_2(\cos\theta)$. This is a linear electric quadrupole. Two
linear quadrupoles may be placed so that the quadrupole term is canceled, but
the $P_3$, the octupole term, survives.
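
A quick numerical comparison shows how rapidly the multipole series for the pair of charges ($+q$ at $z = a$, $-q$ at $z = -a$) converges. The sketch below is an illustration under stated assumptions: potentials are measured in units of $q/4\pi\varepsilon_0$, the function names are invented here, and the sample point ($r = 10a$, $\theta = 0.3$) is arbitrary. It evaluates the exact two-charge potential from the law of cosines and the first two surviving terms of the expansion, Eq. 12.10.

```python
import math

def two_charge_potential(r, theta, a):
    """Exact potential of +q at z = a and -q at z = -a, in units of q/(4 pi eps0)."""
    r1 = math.sqrt(r * r + a * a - 2 * a * r * math.cos(theta))
    r2 = math.sqrt(r * r + a * a + 2 * a * r * math.cos(theta))
    return 1 / r1 - 1 / r2

def dipole_term(r, theta, a):
    """Leading P_1 term of the expansion: 2 a cos(theta) / r^2."""
    return 2 * a * math.cos(theta) / r**2

def octupole_term(r, theta, a):
    """Next surviving term: 2 a^3 P_3(cos theta) / r^4."""
    c = math.cos(theta)
    p3 = 0.5 * (5 * c**3 - 3 * c)
    return 2 * a**3 * p3 / r**4

r, theta, a = 10.0, 0.3, 1.0
exact = two_charge_potential(r, theta, a)
error_dipole = exact - dipole_term(r, theta, a)
error_octupole = error_dipole - octupole_term(r, theta, a)
```

At this distance the dipole term alone is already accurate to about one part in $10^2 (r/a)^{2}$ of itself, and adding the $P_3$ term reduces the residual by roughly another factor of $(a/r)^2$, as the structure of the series suggests.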

Vector Expansion 

3 Equation 12.8 starts with $x^n$. By changing the index, we can transform it
into a series that starts with $x^0$ for $n$ even and $x^1$ for $n$ odd. These ascending
series are given as hypergeometric functions in Eqs. 13.104 and 13.105,
Section 13.5.

FIG. 12.4 Electric dipole

We consider the electrostatic potential produced by a distributed charge
$\rho(\mathbf{r}_2)$:

$$\varphi(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\int\frac{\rho(\mathbf{r}_2)}{|\mathbf{r}_1 - \mathbf{r}_2|}\,d\tau_2. \tag{12.12a}$$

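This expression has already been encountered in Sections 1.15 and 8.7. Taking
the denominator of the integrand, using first the law of cosines and then a
binomial expansion, yields

$$\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|} = (r_1^2 - 2\mathbf{r}_1\cdot\mathbf{r}_2 + r_2^2)^{-1/2}
= \frac{1}{r_1}\left[1 - \frac{2\mathbf{r}_1\cdot\mathbf{r}_2}{r_1^2} + \frac{r_2^2}{r_1^2}\right]^{-1/2}, \qquad \text{for } r_1 > r_2$$

$$= \frac{1}{r_1}\left[1 + \frac{\mathbf{r}_1\cdot\mathbf{r}_2}{r_1^2} - \frac12\frac{r_2^2}{r_1^2} + \frac32\frac{(\mathbf{r}_1\cdot\mathbf{r}_2)^2}{r_1^4} + O\!\left(\frac{r_2}{r_1}\right)^3\right]. \tag{12.12b}$$

(For $r_1 = 1$, $r_2 = t$, and $\mathbf{r}_1\cdot\mathbf{r}_2 = xt$, Eq. 12.12b reduces to the generating
function, Eq. 12.4.)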

The first term in the square bracket, 1, yields a potential

$$\varphi_0(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\,\frac{1}{r_1}\int\rho(\mathbf{r}_2)\,d\tau_2. \tag{12.12c}$$

The integral is just the total charge. This part of the total potential is an electric 
monopole. 

The second term yields

$$\varphi_1(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\,\frac{\mathbf{r}_1}{r_1^3}\cdot\int\mathbf{r}_2\,\rho(\mathbf{r}_2)\,d\tau_2. \tag{12.12d}$$

Here the charge $\rho(\mathbf{r}_2)$ is weighted by a moment arm $\mathbf{r}_2$. We have an electric
dipole potential. For atomic or nuclear states of definite parity $\rho(\mathbf{r}_2)$ is an even
function and the dipole integral is identically zero.

The last two terms, both of order $(r_2/r_1)^2$, may be handled by using cartesian
coordinates

$$(\mathbf{r}_1\cdot\mathbf{r}_2)^2 = \sum_{i=1}^{3}x_{1i}x_{2i}\sum_{j=1}^{3}x_{1j}x_{2j}.$$

Rearranging variables to keep the $x_2$'s inside the integral yields

$$\varphi_2(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\,\frac{1}{2r_1^5}\sum_{i,j=1}^{3}x_{1i}x_{1j}\int\left[3x_{2i}x_{2j} - \delta_{ij}r_2^2\right]\rho(\mathbf{r}_2)\,d\tau_2. \tag{12.12e}$$

This is the electric quadrupole term. We note that the square bracket in the
integrand forms a symmetric, zero trace tensor.

A general electrostatic multipole expansion can also be developed by using
Eq. 12.12a for the potential $\varphi(\mathbf{r}_1)$ and replacing $1/(4\pi|\mathbf{r}_1 - \mathbf{r}_2|)$ by Green's
function, Eq. 16.169. This yields the potential $\varphi(\mathbf{r}_1)$ as a (double) series of the spherical
harmonics $Y_l^m(\theta_1, \phi_1)$ and $Y_l^m(\theta_2, \phi_2)$.

Before leaving multipole fields, perhaps we should emphasize three points.
First, an electric (or magnetic) multipole has an absolute significance only if all
lower-order terms vanish. For instance, the potential of one charge $q$ at $z = a$
was expanded in a series of Legendre polynomials. Although we may refer to
the $P_1(\cos\theta)$ term in this expansion as a dipole term, it should be remembered
that this term exists only because of our choice of coordinates. We actually have
a monopole, $P_0(\cos\theta)$.

Second, in physical systems we do not encounter pure multipoles. As an
example, the potential of the finite dipole ($q$ at $z = a$, $-q$ at $z = -a$) contained
a $P_3(\cos\theta)$ term. These higher-order terms may be eliminated by shrinking the
multipole to a point multipole, in this case keeping the product $qa$ constant
($a \to 0$, $q \to \infty$) to maintain the same dipole moment.

Third, the multipole theory is not restricted to electrical phenomena. Planetary
configurations are described in terms of mass multipoles, Sections 12.3 and
12.5. Gravitational radiation depends on the time behavior of mass quadrupoles.
(The gravitational radiation field is a tensor field. The radiation units, gravitons,
carry two units of angular momentum.)

It might also be noted that a multipole expansion is actually a decomposition 
into the irreducible representations of the rotation group (Section 4.10). 


Extension to Ultraspherical Polynomials 

The generating function, $g(t, x)$, used here is actually a special case of a more
general generating function,

$$\frac{1}{(1 - 2xt + t^2)^{\alpha}} = \sum_{n=0}^{\infty}C_n^{(\alpha)}(x)\,t^n. \tag{12.13}$$

The coefficients $C_n^{(\alpha)}(x)$ are the ultraspherical polynomials (proportional to the
Gegenbauer polynomials). For $\alpha = 1/2$ this equation reduces to Eq. 12.4; that
is, $C_n^{(1/2)}(x) = P_n(x)$. The cases $\alpha = 0$ and $\alpha = 1$ are considered in Chapter 13 in
connection with the Chebyshev polynomials.

EXERCISES

12.1.1 Develop the electrostatic potential for the array of charges shown. This is a
linear electric quadrupole (Fig. 12.5).

12.1.2 Calculate the electrostatic potential of the array of charges shown (Fig. 12.6).
Here is an example of two equal but oppositely directed dipoles. The dipole
contributions cancel. The octupole terms do not cancel.

FIG. 12.6 Linear electric octopole: charges $-q$, $+2q$, $-2q$, $q$ on the $z$-axis at $z = -2a$, $-a$, $a$, $2a$

12.1.3 Show that the electrostatic potential produced by a charge $q$ at $z = a$ for $r < a$ is

$$\varphi(\mathbf{r}) = \frac{q}{4\pi\varepsilon_0 a}\sum_{n=0}^{\infty}\left(\frac{r}{a}\right)^n P_n(\cos\theta).$$

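12.1.4 Using $\mathbf{E} = -\nabla\varphi$, determine the components of the electric field corresponding
to the (pure) electric dipole potential

$$\varphi(\mathbf{r}) = \frac{2aq\,P_1(\cos\theta)}{4\pi\varepsilon_0 r^2}.$$

Here it is assumed that $r \gg a$.

ANS. $$E_r = +\frac{4aq\cos\theta}{4\pi\varepsilon_0 r^3}, \qquad E_\theta = +\frac{2aq\sin\theta}{4\pi\varepsilon_0 r^3}, \qquad E_\phi = 0.$$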

12.1.5 A point electric dipole of strength $p^{(1)}$ is placed at $z = a$; a second point electric
dipole of equal but opposite strength is at the origin. Keeping the product $p^{(1)}a$
constant, let $a \to 0$. Show that this results in a point electric quadrupole.

Hint. Exercise 12.2.5 (when proved) will be helpful.


12.1.6 A point charge $q$ is in the interior of a hollow conducting sphere of radius $r_0$.
The charge $q$ is displaced a distance $a$ from the center of the sphere. If the
conducting sphere is grounded, show that the potential in the interior produced by
$q$ and the distributed induced charge is the same as that produced by $q$ and its
image charge $q'$. The image charge is at a distance $a' = r_0^2/a$ from the center,
collinear with $q$ and the origin (Fig. 12.7).

Hint. Calculate the electrostatic potential for $a < r_0 < a'$. Show that the potential
vanishes for $r = r_0$ if we take $q' = -q r_0/a$.



12.1.7 Prove that

$$P_n(\cos\theta) = (-1)^n\,\frac{r^{n+1}}{n!}\,\frac{\partial^n}{\partial z^n}\left(\frac{1}{r}\right).$$

Hint. Compare the Legendre polynomial expansion of the generating function
($a$, Fig. 12.2 $\to \Delta z$) with a Taylor series expansion of $1/r$, where $z$ dependence of
$r$ changes from $z$ to $z - \Delta z$ (Fig. 12.8).

FIG. 12.8

12.1.8 By differentiation and direct substitution of the series form, Eq. 12.8, show that
$P_n(x)$ satisfies the Legendre differential equation. Note that there is no restriction
upon $x$. We may have any $x$, $-\infty < x < \infty$ and indeed any $z$ in the entire finite
complex plane.


12.1.9 The Chebyshev polynomials (type II) are generated by (Eq. 13.62, Section 13.3)

$$\frac{1}{1 - 2xt + t^2} = \sum_{n=0}^{\infty}U_n(x)\,t^n.$$

Using the techniques of Section 5.4 for transforming series, develop a series
representation of $U_n(x)$.

ANS. $$U_n(x) = \sum_{k=0}^{[n/2]}(-1)^k\frac{(n-k)!}{k!\,(n-2k)!}\,(2x)^{n-2k}.$$


12.2 RECURRENCE RELATIONS AND SPECIAL 
PROPERTIES 

Recurrence Relations 

The Legendre polynomial generating function provides a convenient way of
deriving the recurrence relations¹ and some special properties. If our generating
function (Eq. 12.4) is differentiated with respect to $t$, we obtain

$$\frac{\partial g(t, x)}{\partial t} = \frac{x - t}{(1 - 2xt + t^2)^{3/2}} = \sum_{n=0}^{\infty}n\,P_n(x)\,t^{n-1}. \tag{12.14}$$

By substituting Eq. 12.4 into this and rearranging terms, we have

$$(1 - 2xt + t^2)\sum_{n=0}^{\infty}n\,P_n(x)\,t^{n-1} + (t - x)\sum_{n=0}^{\infty}P_n(x)\,t^n = 0. \tag{12.15}$$


1 We can also apply the explicit series form (Eq. 12.8) directly.

TABLE 12.1 Legendre Polynomials

$P_0(x) = 1$
$P_1(x) = x$
$P_2(x) = \frac12(3x^2 - 1)$
$P_3(x) = \frac12(5x^3 - 3x)$
$P_4(x) = \frac18(35x^4 - 30x^2 + 3)$
$P_5(x) = \frac18(63x^5 - 70x^3 + 15x)$
$P_6(x) = \frac1{16}(231x^6 - 315x^4 + 105x^2 - 5)$
$P_7(x) = \frac1{16}(429x^7 - 693x^5 + 315x^3 - 35x)$
$P_8(x) = \frac1{128}(6435x^8 - 12012x^6 + 6930x^4 - 1260x^2 + 35)$


The left-hand side is a power series in $t$. Since this power series vanishes for all
values of $t$, we may put the coefficient of each power of $t$ equal to zero, that is,
our power series is unique (Section 5.7). This may be done easily by separating
the individual summations and using distinctive summation indices,

$$\sum_{m=0}^{\infty}m\,P_m(x)\,t^{m-1} - \sum_{n=0}^{\infty}2nx\,P_n(x)\,t^n + \sum_{s=0}^{\infty}s\,P_s(x)\,t^{s+1}
+ \sum_{s=0}^{\infty}P_s(x)\,t^{s+1} - \sum_{n=0}^{\infty}x\,P_n(x)\,t^n = 0. \tag{12.16}$$

Now letting $m = n + 1$, $s = n - 1$, we find

$$(2n + 1)\,x\,P_n(x) = (n + 1)\,P_{n+1}(x) + n\,P_{n-1}(x), \qquad n = 1, 2, 3, \ldots. \tag{12.17}$$

This is another three-term recurrence relation similar to (but not identical to) the 
recurrence relation for Bessel functions. With this recurrence relation we may 
easily construct the higher Legendre polynomials. If we take n = 1 and insert the 
easily found values of P 0 (x ) and Pi(x) (Exercise 12.1.7 or Eq. 12.8), we obtain 

3 xP^x) = 2 P 2 (x) + P 0 (x) (12.18) 

or 

P 2 (x) = j(3x 2 ~ 1). (12.19) 

This process may be continued indefinitely. The first few Legendre polynomials 
are listed in Table 12.1. 
Cumbersome as it may appear at first, this technique is actually more efficient
for a large digital computer than is direct evaluation of the series (Eq. 12.8). For
greater stability (to avoid undue accumulation and magnification of round-off
error), Eq. 12.17 is rewritten as

$$P_{n+1}(x) = 2x P_n(x) - P_{n-1}(x) - \frac{x P_n(x) - P_{n-1}(x)}{n + 1}. \tag{12.17a}$$

One starts with $P_0(x) = 1$, $P_1(x) = x$, and computes the numerical values of all
the $P_n(x)$ for a given value of $x$ up to the desired $P_N(x)$. The values of $P_n(x)$,
$0 \le n < N$, are available as a fringe benefit.
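
The upward recurrence just described can be sketched in a few lines. This is only an illustration under our own assumptions: Python is our choice of language, and the function name `legendre_values` is hypothetical rather than anything defined in the text.

```python
def legendre_values(n_max, x):
    """Return [P_0(x), ..., P_{n_max}(x)] using the recurrence of Eq. 12.17a."""
    values = [1.0]                    # P_0(x) = 1
    if n_max >= 1:
        values.append(x)              # P_1(x) = x
    for n in range(1, n_max):
        p_n, p_prev = values[n], values[n - 1]
        # P_{n+1} = 2x P_n - P_{n-1} - (x P_n - P_{n-1}) / (n + 1)
        values.append(2.0 * x * p_n - p_prev - (x * p_n - p_prev) / (n + 1))
    return values


if __name__ == "__main__":
    # Print P_0(0.5) through P_4(0.5); all lower orders come along for free.
    for n, p in enumerate(legendre_values(4, 0.5)):
        print(n, p)
```

A single call returns every $P_n(x)$ up to the requested order, which is the fringe benefit mentioned above.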


Differential Equations 

More information about the behavior of the Legendre polynomials can be 
obtained if we now differentiate Eq. 12.4 with respect to x. This gives 


$$\frac{\partial g(t,x)}{\partial x} = \frac{t}{(1 - 2xt + t^2)^{3/2}} = \sum_{n=0}^{\infty} P_n'(x) t^n \tag{12.20}$$

or

$$(1 - 2xt + t^2)\sum_{n=0}^{\infty} P_n'(x) t^n - t\sum_{n=0}^{\infty} P_n(x) t^n = 0. \tag{12.21}$$

As before, the coefficient of each power of $t$ is set equal to zero and we obtain

$$P_{n+1}'(x) + P_{n-1}'(x) = 2x P_n'(x) + P_n(x). \tag{12.22}$$

A more useful relation may be found by differentiating Eq. 12.17 with respect
to $x$ and multiplying by 2. To this we add $(2n + 1)$ times Eq. 12.22, canceling the
$P_n'$ term. The result is

$$P_{n+1}'(x) - P_{n-1}'(x) = (2n + 1) P_n(x). \tag{12.23}$$

From Eqs. 12.22 and 12.23 numerous additional equations may be developed,² including

$$P_{n+1}'(x) = (n + 1) P_n(x) + x P_n'(x), \tag{12.24}$$

$$P_{n-1}'(x) = -n P_n(x) + x P_n'(x), \tag{12.25}$$


² Using the equation number in parentheses to denote the entire equation,
we may write the derivations as

$$2\,\frac{d}{dx}(12.17) + (2n + 1)\cdot(12.22) \Rightarrow (12.23)$$

$$\tfrac{1}{2}\{(12.22) + (12.23)\} \Rightarrow (12.24)$$

$$\tfrac{1}{2}\{(12.22) - (12.23)\} \Rightarrow (12.25)$$

$$(12.24)_{n \to n-1} + x\cdot(12.25) \Rightarrow (12.26)$$

$$\frac{d}{dx}(12.26) + n\cdot(12.25) \Rightarrow (12.28)$$
$$(1 - x^2) P_n'(x) = n P_{n-1}(x) - n x P_n(x), \tag{12.26}$$

$$(1 - x^2) P_n'(x) = (n + 1) x P_n(x) - (n + 1) P_{n+1}(x). \tag{12.27}$$

By differentiating Eq. 12.26 and using Eq. 12.25 to eliminate $P_{n-1}'(x)$, we find that
$P_n(x)$ satisfies the linear, second-order differential equation

$$(1 - x^2) P_n''(x) - 2x P_n'(x) + n(n + 1) P_n(x) = 0. \tag{12.28}$$

The previous equations, Eqs. 12.22 to 12.27, are all first-order differential equations,
but with polynomials of two different indices. The price for having all
indices alike is a second-order differential equation. Equation 12.28 is Legendre's
differential equation. We now see that the polynomials $P_n(x)$ generated by the
expansion of $(1 - 2xt + t^2)^{-1/2}$ satisfy Legendre's equation which, of course, is
why they are called Legendre polynomials.

In Eq. 12.28 differentiation is with respect to $x$ ($x = \cos\theta$). Frequently, we
encounter Legendre's equation expressed in terms of differentiation with respect
to $\theta$,

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\,\frac{dP_n(\cos\theta)}{d\theta}\right) + n(n + 1) P_n(\cos\theta) = 0. \tag{12.29}$$
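
As a check on the chain of results leading to Eq. 12.28, one can build the polynomial coefficients of $P_n(x)$ from the recurrence relation, Eq. 12.17, and substitute them into Legendre's equation. The sketch below does this in exact rational arithmetic; the language (Python) and the function names `legendre_coefficients`, `derivative`, and `legendre_ode_residual` are our own assumptions, not part of the text.

```python
from fractions import Fraction


def legendre_coefficients(n):
    """Coefficients [a_0, ..., a_n] of P_n(x) = sum_s a_s x**s, built from Eq. 12.17."""
    p_prev, p_curr = [Fraction(1)], [Fraction(0), Fraction(1)]   # P_0 and P_1
    if n == 0:
        return p_prev
    for k in range(1, n):
        # (k + 1) P_{k+1} = (2k + 1) x P_k - k P_{k-1}
        shifted = [Fraction(0)] + p_curr                 # x * P_k
        padded = p_prev + [Fraction(0), Fraction(0)]     # P_{k-1}, padded to equal length
        p_next = [((2 * k + 1) * s - k * q) / (k + 1) for s, q in zip(shifted, padded)]
        p_prev, p_curr = p_curr, p_next
    return p_curr


def derivative(coeffs):
    """Coefficients of the derivative of a polynomial given by its coefficients."""
    return [s * c for s, c in enumerate(coeffs)][1:] or [Fraction(0)]


def legendre_ode_residual(n):
    """Coefficients of (1 - x^2) P_n'' - 2x P_n' + n(n + 1) P_n; all should vanish."""
    p = legendre_coefficients(n)
    dp = derivative(p)
    d2p = derivative(dp)
    residual = [Fraction(0)] * (n + 1)
    for s, c in enumerate(d2p):
        residual[s] += c                  # P''
        if s + 2 <= n:
            residual[s + 2] -= c          # -x^2 P''
    for s, c in enumerate(dp):
        if s + 1 <= n:
            residual[s + 1] -= 2 * c      # -2x P'
    for s, c in enumerate(p):
        residual[s] += n * (n + 1) * c    # n(n + 1) P
    return residual


if __name__ == "__main__":
    print(legendre_coefficients(4))
    print(legendre_ode_residual(4))
```

Because the arithmetic is exact, a residual that is identically zero confirms Eq. 12.28 for the chosen $n$ without any round-off ambiguity.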


Special Values 

Our generating function provides still more information about the Legendre 
polynomials. If we set x = 1, Eq. 12.4 becomes 


$$\frac{1}{(1 - 2t + t^2)^{1/2}} = \frac{1}{1 - t} = \sum_{n=0}^{\infty} t^n, \tag{12.30}$$

using a binomial expansion. But

$$\left.\frac{1}{(1 - 2xt + t^2)^{1/2}}\right|_{x=1} = \sum_{n=0}^{\infty} P_n(1) t^n.$$

Comparing the two series expansions (uniqueness of power series, Section 5.7),
we have

$$P_n(1) = 1. \tag{12.31}$$

If we let $x = -1$, the same sort of analysis shows that

$$P_n(-1) = (-1)^n. \tag{12.32}$$

For obtaining these results, we find that the generating function is more con- 
venient than the explicit series form. 

If we take $x = 0$, using the binomial expansion

$$(1 + t^2)^{-1/2} = 1 - \tfrac{1}{2}t^2 + \tfrac{3}{8}t^4 + \cdots + (-1)^n \frac{1\cdot 3\cdots(2n - 1)}{2^n n!}\, t^{2n} + \cdots, \tag{12.33}$$

we have³

$$P_{2n}(0) = (-1)^n \frac{1\cdot 3\cdots(2n - 1)}{2^n n!} = (-1)^n \frac{(2n - 1)!!}{(2n)!!}, \tag{12.34}$$

$$P_{2n+1}(0) = 0, \qquad n = 0, 1, 2, \ldots. \tag{12.35}$$

These results also follow from Eq. 12.8 by inspection.
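
The special values of Eqs. 12.31, 12.32, 12.34, and 12.35 are easy to confirm numerically. The following sketch evaluates $P_n$ through the recurrence of Eq. 12.17 in exact rational arithmetic and compares with the closed forms; the function names `legendre`, `double_factorial`, and `p_even_at_zero` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction


def legendre(n, x):
    """P_n(x) from (k + 1) P_{k+1} = (2k + 1) x P_k - k P_{k-1} (Eq. 12.17)."""
    p_prev, p_curr = Fraction(1), x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p_curr = p_curr, ((2 * k + 1) * x * p_curr - k * p_prev) / (k + 1)
    return p_curr


def double_factorial(k):
    """k!! = k (k - 2) (k - 4) ..., with 0!! = (-1)!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def p_even_at_zero(n):
    """Closed form of Eq. 12.34 for P_{2n}(0)."""
    return Fraction((-1) ** n * double_factorial(2 * n - 1), double_factorial(2 * n))


if __name__ == "__main__":
    for n in range(5):
        print(2 * n, legendre(2 * n, Fraction(0)), p_even_at_zero(n))
```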


Parity 

Some of these results are special cases of the parity property of the Legendre 
polynomials. We refer once more to Eq. 12.4. If we replace $x$ by $-x$ and $t$ by $-t$,
the generating function is unchanged. Hence

$$\begin{aligned} g(t,x) &= g(-t,-x) \\ &= [1 - 2(-t)(-x) + (-t)^2]^{-1/2} \\ &= \sum_{n=0}^{\infty} P_n(-x)(-t)^n \\ &= \sum_{n=0}^{\infty} P_n(x) t^n. \end{aligned} \tag{12.36}$$

Comparing these two series, we have

$$P_n(-x) = (-1)^n P_n(x); \tag{12.37}$$

that is, the polynomial functions are odd or even (with respect to $x = 0$, $\theta = \pi/2$)
according to whether the index $n$ is odd or even. This is the parity⁴ or reflection
property that plays such an important role in quantum mechanics. For central
forces the index $n$ is a measure of the orbital angular momentum, thus linking
parity and orbital angular momentum.

The reader will see this parity property confirmed by the series solution and
for the special values tabulated in Table 12.1. It might also be noted that Eq.
12.37 may be predicted by inspection of Eq. 12.17, the recurrence relation.
Specifically, if $P_{n-1}(x)$ and $x P_n(x)$ are even, then $P_{n+1}(x)$ must be even.

Upper and Lower Bounds for $P_n(\cos\theta)$

Finally, in addition to these results, our generating function enables us to set
an upper limit on $|P_n(\cos\theta)|$. We have


³ The double factorial notation is defined in Section 10.1.

$(2n)!! = 2\cdot 4\cdot 6\cdots(2n)$, $\quad(2n - 1)!! = 1\cdot 3\cdot 5\cdots(2n - 1)$.

⁴ In spherical polar coordinates the inversion of the point $(r, \theta, \varphi)$ through
the origin is accomplished by the transformation [$r \to r$, $\theta \to \pi - \theta$, and
$\varphi \to \varphi \pm \pi$]. Then, $\cos\theta \to \cos(\pi - \theta) = -\cos\theta$, corresponding to $x \to -x$
(compare Exercise 2.5.8).
TABLE 12.2 Comparison of Generating Function plus Recurrence Relations and Series
Expansion, Eq. 12.8

Application | Generating function, recurrence relations (Eqs. 12.4, 12.17, and 12.22) | Series (Eq. 12.8)
Table 12.1, numerical value | Computer choice | More direct
Derivation of differential equation, Eq. 12.27 | Moderately involved | Verification easy. Derivation requires clairvoyance
$P_n(1)$, Eq. 12.30 | Easy | Awkward
$P_n(0)$, Eq. 12.34 | Easy | By inspection
Parity, Eq. 12.36 | Easy | By inspection
Bounds, Eq. 12.38 | Fairly easy | Awkward



$$\begin{aligned}(1 - 2t\cos\theta + t^2)^{-1/2} &= (1 - te^{i\theta})^{-1/2}(1 - te^{-i\theta})^{-1/2} \\ &= \left(1 + \tfrac{1}{2}te^{i\theta} + \tfrac{3}{8}t^2e^{2i\theta} + \cdots\right)\left(1 + \tfrac{1}{2}te^{-i\theta} + \tfrac{3}{8}t^2e^{-2i\theta} + \cdots\right),\end{aligned} \tag{12.38}$$

with all coefficients positive. Our Legendre polynomial, $P_n(\cos\theta)$, still the
coefficient of $t^n$, may now be written as a sum of terms of the form

$$a_m(e^{im\theta} + e^{-im\theta})/2 = a_m\cosh im\theta = a_m\cos m\theta \tag{12.39a}$$

with all the $a_m$ positive. Then

$$P_n(\cos\theta) = \sum_{m=0\ \text{or}\ 1}^{n} a_m\cos m\theta. \tag{12.39b}$$

This series (Eq. 12.39b) is clearly a maximum when $\theta = 0$ and $\cos m\theta = 1$. But
for $x = \cos\theta = 1$, Eq. 12.31 shows that $P_n(1) = 1$. Therefore

$$|P_n(\cos\theta)| \le P_n(1) = 1. \tag{12.39c}$$

A fringe benefit of Eq. 12.39b is that it shows that our Legendre polynomial
is a linear combination of $\cos m\theta$. This means that the Legendre polynomials
form a complete set for any functions that may be expanded by a Fourier cosine
series (Section 14.1) over the interval $(0, \pi)$.
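
The cosine-sum form of Eq. 12.39b can be made concrete. Writing $(1 - z)^{-1/2} = \sum_k c_k z^k$ with $c_k = (2k)!/[2^{2k}(k!)^2]$, the coefficient of $t^n$ in the product of Eq. 12.38 is $\sum_k c_k c_{n-k}\cos(n - 2k)\theta$. The sketch below (our own illustration; the function names are hypothetical) compares this sum with the recurrence relation and checks the bound of Eq. 12.39c at a few angles.

```python
from math import comb, cos, pi


def binomial_coefficient(k):
    """Coefficient c_k of z**k in (1 - z)**(-1/2): 1, 1/2, 3/8, ..."""
    return comb(2 * k, k) / 4 ** k


def legendre_cosine_sum(n, theta):
    """P_n(cos theta) as the coefficient of t**n in the product of Eq. 12.38."""
    # exp(i k theta) from the first factor pairs with exp(-i (n - k) theta) from the
    # second, so the real sum collects c_k c_{n-k} cos((n - 2k) theta).
    return sum(
        binomial_coefficient(k) * binomial_coefficient(n - k) * cos((n - 2 * k) * theta)
        for k in range(n + 1)
    )


def legendre_recurrence(n, x):
    """P_n(x) from the recurrence relation, Eq. 12.17, for comparison."""
    p_prev, p_curr = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p_curr = p_curr, ((2 * k + 1) * x * p_curr - k * p_prev) / (k + 1)
    return p_curr


if __name__ == "__main__":
    for n in range(6):
        theta = pi / 5
        print(n, legendre_cosine_sum(n, theta), legendre_recurrence(n, cos(theta)))
```

Since every $c_k c_{n-k}$ is positive, the sum is largest when all the cosines equal 1, which is the argument behind Eq. 12.39c.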

In this section various useful properties of the Legendre polynomials are 
derived from the generating function, Eq. 12.4. The explicit series representa- 
tion, Eq. 12.8, offers an alternate and sometimes superior approach. Table 12.2 
offers a comparison of the two approaches. 
EXERCISES


12.2.1 Given the series

$$\alpha_0 + \alpha_2\cos^2\theta + \alpha_4\cos^4\theta + \alpha_6\cos^6\theta = a_0P_0 + a_2P_2 + a_4P_4 + a_6P_6.$$

Express the coefficients $\alpha_i$ as a column vector $\boldsymbol{\alpha}$ and the coefficients $a_i$ as a
column vector $\mathbf{a}$ and determine the matrices $\mathsf{A}$ and $\mathsf{B}$ such that

$$\mathsf{A}\boldsymbol{\alpha} = \mathbf{a} \quad\text{and}\quad \mathsf{B}\mathbf{a} = \boldsymbol{\alpha}.$$

Check your computation by showing that $\mathsf{AB} = 1$ (unit matrix). Repeat for
the odd case

$$\alpha_1\cos\theta + \alpha_3\cos^3\theta + \alpha_5\cos^5\theta + \alpha_7\cos^7\theta = a_1P_1 + a_3P_3 + a_5P_5 + a_7P_7.$$

Note. $P_n(\cos\theta)$ and $\cos^n\theta$ are tabulated in terms of each other in AMS-55.


12.2.2 By differentiating the generating function, $g(t,x)$, with respect to $t$, multiplying
by $2t$, and then adding $g(t,x)$, show that

$$\frac{1 - t^2}{(1 - 2tx + t^2)^{3/2}} = \sum_{n=0}^{\infty}(2n + 1)P_n(x)t^n.$$

This result is useful in calculating the charge induced on a grounded metal
sphere by a point charge $q$.


12.2.3 (a) Derive Eq. 12.27

$$(1 - x^2)P_n'(x) = (n + 1)xP_n(x) - (n + 1)P_{n+1}(x).$$

(b) Write out the relation of Eq. 12.27 to preceding equations in symbolic
form analogous to the symbolic forms for Eqs. 12.23 to 12.26.


12.2.4 A point electric octupole may be constructed by placing a point electric quadrupole
(pole strength $p^{(2)}$ in the $z$-direction) at $z = a$ and an equal but opposite
point electric quadrupole at $z = 0$ and then letting $a \to 0$, subject to $p^{(2)}a$ =
constant. Find the electrostatic potential corresponding to a point electric
octupole. Show from the construction of the point electric octupole that the
corresponding potential may be obtained by differentiating the point quadrupole
potential.


12.2.5 Operating in spherical polar coordinates, show that

$$\frac{\partial}{\partial z}\left[\frac{P_n(\cos\theta)}{r^{n+1}}\right] = -(n + 1)\frac{P_{n+1}(\cos\theta)}{r^{n+2}}.$$

This is the key step in the mathematical argument that the derivative of one
multipole leads to the next higher multipole.

Hint. Compare Exercise 2.5.12.


12.2.6 From

$$P_L(\cos\theta) = \frac{1}{L!}\frac{\partial^L}{\partial t^L}(1 - 2t\cos\theta + t^2)^{-1/2}\Big|_{t=0}$$

show that

$$P_L(1) = 1, \qquad P_L(-1) = (-1)^L.$$

12.2.7 Prove that

$$P_n'(1) = \frac{d}{dx}P_n(x)\Big|_{x=1} = \tfrac{1}{2}n(n + 1).$$

12.2.8 Show that $P_n(\cos\theta) = (-1)^nP_n(-\cos\theta)$ by use of the recurrence relation relating
$P_n$, $P_{n+1}$, and $P_{n-1}$ and your knowledge of $P_0$ and $P_1$.

12.2.9 From Eq. 12.38 write out the coefficient of $t^2$ in terms of $\cos n\theta$, $n \le 2$. This
coefficient is $P_2(\cos\theta)$.

12.2.10 Write a program that will generate the coefficients $a_s$ in the polynomial form
of the Legendre polynomial,

$$P_n(x) = \sum_{s=0}^{n} a_s x^s.$$

12.2.11 (a) Calculate $P_{10}(x)$ over the range [0, 1] and plot your results.

(b) Calculate precise (at least to five decimal places) values of the five positive
roots of $P_{10}(x)$. Compare your values with the values listed in AMS-55
(Table 25.4).

Hint. See Appendix 1 for root-finding techniques.

12.2.12 (a) Calculate the largest root of $P_n(x)$ for $n = 2(1)50$.

(b) Develop an approximation for the largest root from the hypergeometric
representation of $P_n(x)$ (Section 13.4) and compare your values from part
(a) with your hypergeometric approximation. Compare also with the
values listed in AMS-55 (Table 25.4).

12.2.13 (a) From Exercise 12.2.1 and AMS-55 (Table 22.9) develop the $6 \times 6$ matrix
$\mathsf{B}$ that will transform a series of even order Legendre polynomials through
$P_{10}(x)$ into a power series $\sum_{n=0}^{5}\alpha_{2n}x^{2n}$.

(b) Calculate $\mathsf{A}$ as $\mathsf{B}^{-1}$. Check the elements of $\mathsf{A}$ against the values listed in
AMS-55 (Table 22.9).

(c) By using matrix multiplication, transform some even power series
$\sum_{n=0}^{5}\alpha_{2n}x^{2n}$ into a Legendre series.

12.2.14 Write a subroutine that will transform a finite power series $\sum_{n=0}^{N}a_nx^n$ into a
Legendre series $\sum_{n=0}^{N}b_nP_n(x)$. Use the recurrence relation Eq. 12.17 and follow
the technique outlined in Section 13.3 for a Chebyshev series.


12.3 ORTHOGONALITY 

Legendre’s differential equation (12.28) may be written in the form 

$$\frac{d}{dx}\left[(1 - x^2)P_n'(x)\right] + n(n + 1)P_n(x) = 0, \tag{12.40}$$

showing clearly that it is self-adjoint. Subject to satisfying certain boundary
conditions, then, it is known that the solutions $P_n(x)$ will be orthogonal. Repeating
the Sturm-Liouville analysis (Section 9.2), we multiply Eq. 12.40 by
$P_m(x)$ and subtract the corresponding equation with $m$ and $n$ interchanged.
Integrating from $-1$ to $+1$, we get

$$\int_{-1}^{1}\left\{P_m(x)\frac{d}{dx}\left[(1 - x^2)P_n'(x)\right] - P_n(x)\frac{d}{dx}\left[(1 - x^2)P_m'(x)\right]\right\}dx = [m(m + 1) - n(n + 1)]\int_{-1}^{1}P_n(x)P_m(x)\,dx. \tag{12.41}$$
Integrating by parts, the integrated part vanishing because of the factor $(1 - x^2)$,¹
we have

$$[m(m + 1) - n(n + 1)]\int_{-1}^{1}P_n(x)P_m(x)\,dx = 0. \tag{12.42}$$

Then for $m \ne n$

$$\int_{-1}^{1}P_n(x)P_m(x)\,dx = 0,{}^{2} \qquad \int_{0}^{\pi}P_n(\cos\theta)P_m(\cos\theta)\sin\theta\,d\theta = 0, \tag{12.43}$$

showing that $P_n(x)$ and $P_m(x)$ are orthogonal for the interval $[-1, 1]$. This
orthogonality may also be demonstrated quite readily by using Rodrigues'
definition of $P_n(x)$ (compare Section 12.4, Exercise 12.4.2).

We shall need to evaluate the integral (Eq. 12.42) when $n = m$. Certainly
it is no longer zero. From our generating function

$$(1 - 2tx + t^2)^{-1} = \left[\sum_{n=0}^{\infty}P_n(x)t^n\right]^2. \tag{12.44}$$

Integrating from $x = -1$ to $x = +1$, we have

$$\int_{-1}^{1}\frac{dx}{1 - 2tx + t^2} = \sum_{n=0}^{\infty}t^{2n}\int_{-1}^{1}[P_n(x)]^2\,dx; \tag{12.45}$$

the cross terms in the series vanish by means of Eq. 12.43. Using $y = 1 - 2tx + t^2$,
we obtain

$$\int_{-1}^{1}\frac{dx}{1 - 2tx + t^2} = \frac{1}{2t}\int_{(1-t)^2}^{(1+t)^2}\frac{dy}{y} = \frac{1}{t}\ln\left(\frac{1 + t}{1 - t}\right). \tag{12.46}$$

Expanding this in a power series (Exercise 5.4.1) gives us

$$\frac{1}{t}\ln\left(\frac{1 + t}{1 - t}\right) = 2\sum_{n=0}^{\infty}\frac{t^{2n}}{2n + 1}. \tag{12.47}$$

Since our power-series representation is known to be unique, we must have

$$\int_{-1}^{1}[P_n(x)]^2\,dx = \frac{2}{2n + 1}. \tag{12.48}$$


¹ This of course is why the limits were chosen as $-1$ and $+1$.

² In Section 9.4 such integrals are interpreted as inner products in a linear
vector (function) space. Alternate notations are

$$\int_{-1}^{1}P_n(x)P_m(x)\,dx = \langle P_n(x)|P_m(x)\rangle = (P_n(x), P_m(x)).$$

The $\langle\ \rangle$ form, popularized by Dirac, is common in physics literature. The
$(\ )$ form is more common in the mathematics literature.

We shall return to this result in Section 12.6 when we construct the orthonormal
spherical harmonics.
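
Both the orthogonality integral, Eq. 12.43, and the normalization, Eq. 12.48, can be confirmed exactly for low orders, since the integrand is a polynomial. The following sketch is an illustration only; it builds the coefficients from Eq. 12.17, and the function names `legendre_coefficients` and `integrate_product` are our own hypothetical choices.

```python
from fractions import Fraction


def legendre_coefficients(n):
    """Coefficients [a_0, ..., a_n] of P_n(x) = sum_s a_s x**s, built from Eq. 12.17."""
    p_prev, p_curr = [Fraction(1)], [Fraction(0), Fraction(1)]   # P_0 and P_1
    if n == 0:
        return p_prev
    for k in range(1, n):
        shifted = [Fraction(0)] + p_curr                 # x * P_k
        padded = p_prev + [Fraction(0), Fraction(0)]     # P_{k-1}, padded to equal length
        p_next = [((2 * k + 1) * s - k * q) / (k + 1) for s, q in zip(shifted, padded)]
        p_prev, p_curr = p_curr, p_next
    return p_curr


def integrate_product(p, q):
    """Exact integral over [-1, 1] of the product of two polynomials."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:                  # odd powers integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total


if __name__ == "__main__":
    for n in range(4):
        row = [integrate_product(legendre_coefficients(n), legendre_coefficients(m))
               for m in range(4)]
        print(n, row)
```

The printed rows form a diagonal matrix whose diagonal entries are $2/(2n + 1)$.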

Expansion of Functions, Legendre Series 

In addition to orthogonality, the Sturm-Liouville theory shows that the 
Legendre polynomials form a complete set. Let us assume, then, that the series 

$$\sum_{n=0}^{\infty}a_nP_n(x) = f(x), \tag{12.49}$$

in the sense of convergence in the mean (Section 9.4) in the interval $[-1, 1]$.
This demands that $f(x)$ and $f'(x)$ be at least sectionally continuous in this
interval. The coefficients $a_n$ are found by multiplying the series by $P_m(x)$ and
integrating term by term. Using the orthogonality property expressed in Eqs.
12.43 and 12.48, we obtain

$$\frac{2}{2m + 1}a_m = \int_{-1}^{1}f(x)P_m(x)\,dx.{}^{3} \tag{12.50}$$

We replace the variable of integration $x$ by $t$ and the index $m$ by $n$. Then,
substituting into Eq. 12.49, we have

$$f(x) = \sum_{n=0}^{\infty}\frac{2n + 1}{2}\left(\int_{-1}^{1}f(t)P_n(t)\,dt\right)P_n(x).{}^{4} \tag{12.51}$$

This expansion in a series of Legendre polynomials is usually referred to as a
Legendre series. Its properties are quite similar to the more familiar Fourier
series (Chapter 14). In particular, we can use the orthogonality property
(Eq. 12.43) to show that the series is unique.

On a more abstract (and more powerful) level, Eq. 12.51 gives the representation
of $f(x)$ in the linear vector space of Legendre polynomials (a Hilbert
space, Section 9.4).

From the viewpoint of integral transforms (Chapter 15) Eq. 12.50 may be
considered a finite Legendre transform of $f(x)$. Equation 12.51 is then the
inverse transform. It may also be interpreted in terms of the projection operators
of quantum theory. We may take

$$\mathcal{P}_m \equiv P_m(x)\,\frac{2m + 1}{2}\int_{-1}^{1}P_m(t)[\ \ ]\,dt$$

as an (integral) operator, ready to operate on $f(t)$. [The $f(t)$ would go in the
square bracket as a factor in the integrand.] Then, from Eq. 12.50,

$$\mathcal{P}_m f = a_mP_m(x).$$

The operator $\mathcal{P}_m$ projects out the $m$th component of the function $f$.


³ Note that Eq. 12.50 gives $a_m$ as a definite integral, that is, a number for a
given $f(x)$.

⁴ The dependent variables are arbitrary. Here $x$ came from the $x$ in $\mathcal{P}_m$ while
$t$ is a dummy variable of integration.

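
For a polynomial $f(x)$ the coefficient integral of Eq. 12.50 can be evaluated exactly, which gives a small worked illustration of a Legendre series. The sketch below is ours, not the text's: the function names are hypothetical, and $f(x)$ is assumed to be given by its power-series coefficients.

```python
from fractions import Fraction


def legendre_coefficients(n):
    """Coefficients [a_0, ..., a_n] of P_n(x) = sum_s a_s x**s, built from Eq. 12.17."""
    p_prev, p_curr = [Fraction(1)], [Fraction(0), Fraction(1)]   # P_0 and P_1
    if n == 0:
        return p_prev
    for k in range(1, n):
        shifted = [Fraction(0)] + p_curr                 # x * P_k
        padded = p_prev + [Fraction(0), Fraction(0)]     # P_{k-1}, padded to equal length
        p_next = [((2 * k + 1) * s - k * q) / (k + 1) for s, q in zip(shifted, padded)]
        p_prev, p_curr = p_curr, p_next
    return p_curr


def integrate_product(p, q):
    """Exact integral over [-1, 1] of the product of two polynomials."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:                  # odd powers integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total


def legendre_series(power_coeffs):
    """Legendre coefficients a_n (Eq. 12.50) of f(x) = sum_s power_coeffs[s] x**s."""
    f = [Fraction(c) for c in power_coeffs]
    return [
        Fraction(2 * n + 1, 2) * integrate_product(f, legendre_coefficients(n))
        for n in range(len(f))
    ]


if __name__ == "__main__":
    # f(x) = x^3 should come out as (3/5) P_1(x) + (2/5) P_3(x).
    print(legendre_series([0, 0, 0, 1]))
```

Coefficients beyond the degree of $f(x)$ vanish by orthogonality, so the list is cut off at that degree.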
Equation 12.3, which leads directly to the generating function definition of
Legendre polynomials, is a Legendre expansion of $1/r_1$. This Legendre expansion
of $1/r_1$ or $1/r_{12}$ appears in several exercises of Section 12.8. Going beyond a
simple Coulomb field, the $1/r_{12}$ is often replaced by a potential $V(|\mathbf{r}_1 - \mathbf{r}_2|)$
and the solution of the problem is again effected by a Legendre expansion.
In nuclear physics calculations the coefficients $a_n$ may be computed (by a
computing machine) up through $a_{100}$.

The Legendre series, Eq. 12.49, has been treated as a known function f(x) 
that we arbitrarily chose to expand in a series of Legendre polynomials. Some- 
times the origin and nature of the Legendre series is different. In the next 
examples we consider unknown functions we know can be represented by a 
Legendre series because of the differential equation the unknown functions 
satisfy. As before, the problem is to determine the unknown coefficients in the 
series expansion. Here, however, the coefficients are not found by Eq. 12.50. 
Rather, they are determined by demanding that the Legendre series match a 
known solution at a boundary. These are boundary value problems. 

EXAMPLE 12.3.1 Earth’s Gravitational Field 


An example of a Legendre series is provided by the description of the earth’s 
gravitational potential U (for exterior points), neglecting azimuthal effects. 
With 

$$R = \text{equatorial radius} = 6378.1 \pm 0.1\ \text{km}$$

$$\frac{GM}{R} = 62.494 \pm 0.001\ \text{km}^2/\text{sec}^2,$$

we write

$$U(r,\theta) = \frac{GM}{R}\left[\frac{R}{r} - \sum_{n=2}^{\infty}a_n\left(\frac{R}{r}\right)^{n+1}P_n(\cos\theta)\right], \tag{12.52}$$

a Legendre series. Artificial satellite motions have shown that

$$a_2 = (1{,}082{,}635 \pm 11)\times 10^{-9}$$

$$a_3 = (-2{,}531 \pm 7)\times 10^{-9}.$$

This is the famous pear-shaped deformation of the earth,

$$a_4 = (-1{,}600 \pm 12)\times 10^{-9}.$$

Other coefficients have been computed through $n = 20$. The reader might note
that $P_1$ is omitted, since it would represent a displacement and not a deformation.

More recent satellite data permit a determination of the longitudinal depen- 
dence of the earth’s gravitational field. Such dependence may be described by a 
Laplace series (Section 12.6). 
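
To get a feeling for the size of these corrections, the truncated series of Eq. 12.52 may be evaluated directly. The sketch below uses only the central values of $GM/R$, $a_2$, $a_3$, and $a_4$ quoted above and ignores their uncertainties; the names `GM_OVER_R`, `ZONAL`, `legendre`, and `potential` are hypothetical and chosen for this illustration.

```python
from math import cos, pi

GM_OVER_R = 62.494                                    # km^2/sec^2, value quoted in the text
ZONAL = {2: 1082635e-9, 3: -2531e-9, 4: -1600e-9}     # a_2, a_3, a_4 quoted in the text


def legendre(n, x):
    """P_n(x) from the recurrence relation, Eq. 12.17."""
    p_prev, p_curr = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p_curr = p_curr, ((2 * k + 1) * x * p_curr - k * p_prev) / (k + 1)
    return p_curr


def potential(r_over_R, theta):
    """U(r, theta) of Eq. 12.52, truncated after a_4, in km^2/sec^2."""
    inverse = 1.0 / r_over_R                          # R / r
    series = sum(a * inverse ** (n + 1) * legendre(n, cos(theta))
                 for n, a in ZONAL.items())
    return GM_OVER_R * (inverse - series)


if __name__ == "__main__":
    # Compare the pole (theta = 0) with the equator (theta = pi/2) at r = R.
    print(potential(1.0, 0.0), potential(1.0, pi / 2))
```

The difference between the polar and equatorial values is dominated by the $a_2$ term, as the relative sizes of the coefficients suggest.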
FIG. 12.9 Conducting sphere in a uniform field


EXAMPLE 12.3.2 Sphere in a Uniform Field

Another illustration of the use of Legendre polynomials is provided by the
problem of a neutral conducting sphere (radius $r_0$) placed in a (previously)
uniform electric field (Fig. 12.9). The problem is to find the new, perturbed,
electrostatic potential. Calling the electrostatic potential $V$,⁵

$$\nabla^2V = 0, \tag{12.53}$$

Laplace's equation. We select spherical polar coordinates because of the spherical
shape of the conductor. (This will simplify the application of the boundary
condition at the surface of the conductor.) Separating variables and glancing at
Table 8.1 if necessary, we can write the unknown potential $V(r,\theta)$ as a linear
combination of solutions:

$$V(r,\theta) = \sum_{n=0}^{\infty}a_nr^nP_n(\cos\theta) + \sum_{n=0}^{\infty}b_n\frac{P_n(\cos\theta)}{r^{n+1}}. \tag{12.54}$$

No $\varphi$-dependence appears because of the axial symmetry of our problem. (The
center of the conducting sphere is taken as the origin and the $z$-axis is oriented
parallel to the original uniform field.)

It might be noted here that $n$ is an integer, because only for integral $n$ is the $\theta$
dependence well behaved at $\cos\theta = \pm 1$. For nonintegral $n$ the solutions of
Legendre's equation diverge at the ends of the interval $[-1, 1]$, the poles $\theta = 0$,
$\pi$ of the sphere (compare Example 5.2.4 and Exercises 5.2.15 and 8.5.5). It is for
this same reason that the second solution of Legendre's equation, $Q_n$, is also
excluded.

Now we turn to our (Dirichlet) boundary conditions to determine the
unknown $a_n$'s and $b_n$'s of our series solution, Eq. 12.54. If the original unperturbed
electrostatic field is $E_0$, we require, as one boundary condition,

$$V(r \to \infty) = -E_0z = -E_0r\cos\theta = -E_0rP_1(\cos\theta). \tag{12.55}$$


⁵ It should be emphasized that this is not a presentation of a Legendre series
expansion of a known $V(\cos\theta)$. Here we are back to boundary value problems.
Since our Legendre series is unique, we may equate coefficients of $P_n(\cos\theta)$ in
Eq. 12.54 ($r \to \infty$) and Eq. 12.55 to obtain

$$a_n = 0, \quad n > 1, \qquad a_1 = -E_0. \tag{12.56}$$

If $a_n \ne 0$ for $n > 1$, these terms would dominate at large $r$ and the boundary
condition (Eq. 12.55) could not be satisfied.

As a second boundary condition, we may choose the conducting sphere and
the plane $\theta = \pi/2$ to be at zero potential, which means that Eq. 12.54 now
becomes

$$V(r = r_0) = a_0 + \frac{b_0}{r_0} + \left(\frac{b_1}{r_0^2} - E_0r_0\right)P_1(\cos\theta) + \sum_{n=2}^{\infty}b_n\frac{P_n(\cos\theta)}{r_0^{n+1}} = 0. \tag{12.57}$$

In order that this may hold for all values of $\theta$, each coefficient of $P_n(\cos\theta)$ must
vanish.⁶ Hence

$$a_0 = b_0 = 0,{}^{7} \qquad b_n = 0, \quad n \ge 2, \tag{12.58}$$

whereas

$$b_1 = E_0r_0^3. \tag{12.59}$$

The electrostatic potential (outside the sphere) is then

$$V = -E_0rP_1(\cos\theta) + \frac{E_0r_0^3}{r^2}P_1(\cos\theta) = -E_0rP_1(\cos\theta)\left(1 - \frac{r_0^3}{r^3}\right). \tag{12.60}$$

In Section 1.15 it was shown that a solution of Laplace's equation that
satisfied the boundary conditions over the entire boundary was unique. The
electrostatic potential $V$, as given by Eq. 12.60, is a solution of Laplace's equation.
It satisfies our boundary conditions and therefore is the solution of
Laplace's equation for this problem.

It may further be shown (Exercise 12.3.13) that there is an induced surface
charge density

$$\sigma = -\varepsilon_0\left.\frac{\partial V}{\partial r}\right|_{r=r_0} = 3\varepsilon_0E_0\cos\theta \tag{12.61}$$

on the surface of the sphere and an induced electric dipole moment (Exercise
12.3.13)

$$P = 4\pi r_0^3\varepsilon_0E_0. \tag{12.62}$$


⁶ Again, this is equivalent to saying that a series expansion in Legendre polynomials
(or any complete orthogonal set) is unique.

⁷ The coefficient of $P_0$ is $a_0 + b_0/r_0$. We set $b_0 = 0$ (and therefore $a_0 = 0$ also),
since there is no net charge on the sphere. If there is a net charge $q$, then
$b_0 \ne 0$.

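
The result of Eq. 12.60 is simple enough to check by machine: the potential should vanish on the sphere, and a numerical radial derivative at $r = r_0$ should reproduce the surface charge density of Eq. 12.61. The sketch below is only an illustration; the SI value of $\varepsilon_0$, the finite-difference step, and the function names are our own assumptions.

```python
from math import cos

EPSILON_0 = 8.8541878128e-12   # F/m, vacuum permittivity (SI value, not from the text)


def potential(r, theta, e0, r0):
    """V(r, theta) of Eq. 12.60 outside a conducting sphere of radius r0."""
    return -e0 * r * cos(theta) * (1.0 - (r0 / r) ** 3)


def surface_charge_density(theta, e0, r0, relative_step=1e-6):
    """sigma = -epsilon_0 dV/dr at r = r0 (Eq. 12.61), by a one-sided difference."""
    h = relative_step * r0
    dv_dr = (potential(r0 + h, theta, e0, r0) - potential(r0, theta, e0, r0)) / h
    return -EPSILON_0 * dv_dr


if __name__ == "__main__":
    e0, r0, theta = 2.0, 0.5, 0.4          # arbitrary illustrative values (SI units)
    print(potential(r0, theta, e0, r0))     # zero on the conductor
    print(surface_charge_density(theta, e0, r0), 3.0 * EPSILON_0 * e0 * cos(theta))
```

The one-sided difference is used because Eq. 12.60 holds only outside the sphere.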

EXAMPLE 12.3.3 Electrostatic Potential of a Ring of Charge

As a further example, consider the electrostatic potential produced by a conducting
ring carrying a total electric charge $q$ (Fig. 12.10). From electrostatics
(and Section 1.14) the potential $\psi$ satisfies Laplace's equation. Separating
variables in spherical polar coordinates (compare Table 8.1), we obtain

$$\psi(r,\theta) = \sum_{n=0}^{\infty}a_n\frac{a^n}{r^{n+1}}P_n(\cos\theta), \qquad r > a. \tag{12.63a}$$

Here $a$ is the radius of the ring that is assumed to be in the $\theta = \pi/2$ plane. There is
no $\varphi$ (azimuthal) dependence because of the cylindrical symmetry of the system.


FIG. 12.10 Charged, conducting ring


The terms with positive exponent radial dependence have been rejected since
the potential must have an asymptotic behavior

$$\psi \sim \frac{q}{4\pi\varepsilon_0}\cdot\frac{1}{r}, \qquad r \gg a. \tag{12.63b}$$

The problem is to determine the coefficients $a_n$ in Eq. 12.63a. This may be done
by evaluating $\psi(r,\theta)$ at $\theta = 0$, $r = z$, and comparing with an independent calculation
of the potential from Coulomb's law. In effect, we are using a boundary
condition along the $z$-axis. From Coulomb's law (with all charge equidistant),

$$\begin{aligned}\psi(r,\theta) &= \frac{q}{4\pi\varepsilon_0}\cdot\frac{1}{(z^2 + a^2)^{1/2}}, \qquad \theta = 0,\ r = z, \\ &= \frac{q}{4\pi\varepsilon_0z}\sum_{s=0}^{\infty}(-1)^s\frac{(2s)!}{2^{2s}(s!)^2}\left(\frac{a}{z}\right)^{2s}, \qquad z > a.\end{aligned} \tag{12.63c}$$

The last step uses the result of Exercise 10.1.15. Now, Eq. 12.63a evaluated at
$\theta = 0$, $r = z$ (with $P_n(1) = 1$), yields

$$\psi(r,\theta) = \sum_{n=0}^{\infty}a_n\frac{a^n}{z^{n+1}}, \qquad r = z. \tag{12.63d}$$

Comparing Eqs. 12.63c and 12.63d, we get $a_n = 0$ for $n$ odd. Setting $n = 2s$, we
have

$$a_{2s} = \frac{q}{4\pi\varepsilon_0}(-1)^s\frac{(2s)!}{2^{2s}(s!)^2}, \tag{12.63e}$$

and our electrostatic potential $\psi(r,\theta)$ is given by

$$\psi(r,\theta) = \frac{q}{4\pi\varepsilon_0r}\sum_{s=0}^{\infty}(-1)^s\frac{(2s)!}{2^{2s}(s!)^2}\left(\frac{a}{r}\right)^{2s}P_{2s}(\cos\theta), \qquad r > a. \tag{12.63f}$$

The magnetic analog of this problem appears in Section 12.5, Example 12.5.1.
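
As a sanity check on Eq. 12.63f, the series may be summed numerically and compared on the $z$-axis with the closed form from Coulomb's law, Eq. 12.63c. In the sketch below the potential is measured in units of $q/4\pi\varepsilon_0r$; the truncation at `s_max` and the function names are assumptions made for this illustration.

```python
from math import comb, cos, sqrt


def legendre(n, x):
    """P_n(x) from the recurrence relation, Eq. 12.17."""
    p_prev, p_curr = 1.0, x
    if n == 0:
        return p_prev
    for k in range(1, n):
        p_prev, p_curr = p_curr, ((2 * k + 1) * x * p_curr - k * p_prev) / (k + 1)
    return p_curr


def ring_potential(r_over_a, theta, s_max=40):
    """psi(r, theta) of Eq. 12.63f in units of q / (4 pi epsilon_0 r), for r > a."""
    ratio_sq = 1.0 / r_over_a ** 2                           # (a / r)^2
    total = 0.0
    for s in range(s_max + 1):
        coefficient = (-1) ** s * comb(2 * s, s) / 4 ** s    # (-1)^s (2s)! / (2^{2s} (s!)^2)
        total += coefficient * ratio_sq ** s * legendre(2 * s, cos(theta))
    return total


if __name__ == "__main__":
    # On the axis (theta = 0, r = z = 2a) compare with r / (z^2 + a^2)^(1/2).
    print(ring_potential(2.0, 0.0), 1.0 / sqrt(1.0 + 0.25))
```

Convergence slows as $r/a$ approaches 1, so `s_max` would need to grow near the ring.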


EXERCISES 


12.3.1 You have constructed a set of orthogonal functions by the Gram-Schmidt
process (Section 9.3), taking $u_n(x) = x^n$, $n = 0, 1, 2, \ldots$, in increasing order with
$w(x) = 1$ and an interval $-1 \le x \le 1$. Prove that the $n$th such function constructed
is proportional to $P_n(x)$.

Hint. Use mathematical induction.

12.3.2 Expand the Dirac delta function in a series of Legendre polynomials, using the
interval $-1 \le x \le 1$.


12.3.3 Verify the Dirac delta function expansions

$$\delta(1 - x) = \sum_{n=0}^{\infty}\frac{2n + 1}{2}P_n(x)$$

$$\delta(1 + x) = \sum_{n=0}^{\infty}(-1)^n\frac{2n + 1}{2}P_n(x).$$

These expressions appear in a resolution of the Rayleigh plane wave expansion
(Exercise 12.4.7) into incoming and outgoing spherical waves.

Note. Assume that the entire Dirac delta function is covered when integrating
over $[-1, 1]$.


12.3.4 Neutrons (mass 1) are being scattered by a nucleus of mass $A$ ($A > 1$). In the
center of mass system the scattering is isotropic. Then, in the lab system the
average of the cosine of the angle of deflection of the neutron is

$$\langle\cos\psi\rangle = \frac{1}{2}\int_{0}^{\pi}\frac{A\cos\theta + 1}{(A^2 + 2A\cos\theta + 1)^{1/2}}\sin\theta\,d\theta.$$

Show, by expansion of the denominator, that $\langle\cos\psi\rangle = \dfrac{2}{3A}$.


12.3.5 A particular function $f(x)$ defined over the interval $[-1, 1]$ is expanded in
a Legendre series over this same interval. Show that the expansion is unique.

12.3.6 A function $f(x)$ is expanded in a Legendre series $f(x) = \sum_{n=0}^{\infty}a_nP_n(x)$. Show that

$$\int_{-1}^{1}[f(x)]^2\,dx = \sum_{n=0}^{\infty}\frac{2a_n^2}{2n + 1}.$$

This is the Legendre form of the Fourier series Parseval identity, Exercise
14.4.2. It also illustrates Bessel's inequality, Eq. 9.72, becoming an equality for
a complete set.


12.3.7 Derive the recurrence relation

$$(1 - x^2)P_n'(x) = nP_{n-1}(x) - nxP_n(x)$$

from the Legendre polynomial generating function.

12.3.8 Evaluate $\int_{0}^{1}P_n(x)\,dx$.

ANS. $n = 2s$; 1 for $s = 0$, 0 for $s > 0$,
$n = 2s + 1$; $P_{2s}(0)/(2s + 2) = (-1)^s(2s - 1)!!/(2s + 2)!!$

Hint. Use a recurrence relation to replace $P_n(x)$ by derivatives and then integrate
by inspection. Alternatively, you can integrate the generating function.


12.3.9 (a) For

$$f(x) = \begin{cases} +1, & 0 < x < 1 \\ -1, & -1 < x < 0, \end{cases}$$

show that

$$\int_{-1}^{1}[f(x)]^2\,dx = 2\sum_{n=0}^{\infty}(4n + 3)\left[\frac{(2n - 1)!!}{(2n + 2)!!}\right]^2.$$

(b) By testing the series, prove that the series is convergent.

12.3.10 Prove that

$$\int_{-1}^{1}x(1 - x^2)P_n'P_m'\,dx = 0, \quad\text{unless } m = n \pm 1.$$

12.3.11 The amplitude of a scattered wave is given by

$$f(\theta) = ƛ\sum_{l=0}^{\infty}(2l + 1)\exp[i\delta_l]\sin\delta_l\,P_l(\cos\theta).$$

Here $\theta$ is the angle of scattering, $l$ the angular momentum, and $\delta_l$ the phase
shift produced by the central potential that is doing the scattering. The total
cross section is $\sigma_{\text{tot}} = \int f^*(\theta)f(\theta)\,d\Omega$. Show that

$$\sigma_{\text{tot}} = 4\piƛ^2\sum_{l=0}^{\infty}(2l + 1)\sin^2\delta_l.$$

12.3.12 The coincidence counting rate, $W(\theta)$, in a gamma-gamma angular correlation
experiment has the form

$$W(\theta) = \sum_{n=0}^{\infty}a_{2n}P_{2n}(\cos\theta).$$

Show that data in the range $\pi/2 \le \theta \le \pi$ can, in principle, define the function
$W(\theta)$ (and permit a determination of the coefficients $a_{2n}$). This means that
although data in the range $0 \le \theta < \pi/2$ may be useful as a check, they are not
essential.

12.3.13 A conducting sphere of radius $r_0$ is placed in an initially uniform electric field,
$E_0$. Show the following:

(a) The induced surface charge density is

$$\sigma = 3\varepsilon_0E_0\cos\theta.$$

(b) The induced electric dipole moment is

$$P = 4\pi r_0^3\varepsilon_0E_0.$$

The induced electric dipole moment can be calculated either from the
surface charge [part (a)], or by noting that the final electric field $E$ is the
result of superimposing a dipole field on the original uniform field.


12.3.14 A charge $q$ is displaced a distance $a$ along the $z$-axis from the center of a spherical
cavity of radius $R$.

(a) Show that the electric field averaged over the volume $a \le r \le R$ is zero.

(b) Show that the electric field averaged over the volume $0 \le r \le a$ is

$$\mathbf{E} = \mathbf{k}E_z = -\mathbf{k}\frac{q}{4\pi\varepsilon_0a^2} \quad\text{(SI units)} \quad = -\mathbf{k}\frac{nqa}{3\varepsilon_0},$$

where $n$ is the number of such displaced charges per unit volume. This is a
basic calculation in the polarization of a dielectric.

Hint. $\mathbf{E} = -\nabla\varphi$.

12.3.15 Determine the electrostatic potential (Legendre expansion) of a circular ring of
electric charge for $r < a$.

12.3.16 Calculate the electric field produced by the charged conducting ring of Example
12.3.3 for

(a) $r > a$,

(b) $r < a$.


12.3.17 As an extension of Example 12.3.3, find the potential $\psi(r,\theta)$ produced by a
charged conducting disk, Fig. 12.11, for $r > a$, the radius of the disk.

The charge density $\sigma$ (on each side of the disk) is

$$\sigma(\rho) = \frac{q}{4\pi a(a^2 - \rho^2)^{1/2}}, \qquad \rho^2 = x^2 + y^2.$$


FIG. 12.11 Charged, conducting disk


Hint. The definite integral you get can be evaluated as a beta function, Section
10.4.

ANS. $$\psi(r,\theta) = \frac{q}{4\pi\varepsilon_0r}\sum_{l=0}^{\infty}(-1)^l\frac{1}{2l + 1}\left(\frac{a}{r}\right)^{2l}P_{2l}(\cos\theta).$$

12.3.18 From the result of Exercise 12.3.17 calculate the potential of the disk. Since you
are violating the condition $r > a$, justify your calculation carefully.

Hint. You may run into the series given in Exercise 5.2.14.

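12.3.19 The hemisphere defined by $r = a$, $0 \le \theta < \pi/2$ has an electrostatic potential
$+V_0$. The hemisphere $r = a$, $\pi/2 < \theta \le \pi$ has an electrostatic potential $-V_0$.
Show that the potential at interior points is

$$V = V_0\sum_{s=0}^{\infty}(-1)^s(4s + 3)\frac{(2s - 1)!!}{(2s + 2)!!}\left(\frac{r}{a}\right)^{2s+1}P_{2s+1}(\cos\theta).$$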




12.3.20 A conducting sphere of radius $a$ is divided into two electrically separate
hemispheres by a thin insulating barrier at its equator. The top hemisphere is
maintained at a potential $V_0$, the bottom hemisphere at $-V_0$.

(a) Show that the electrostatic potential exterior to the two hemispheres is

$$V(r,\theta) = V_0\sum_{s=0}^{\infty}(-1)^s(4s + 3)\frac{(2s - 1)!!}{(2s + 2)!!}\left(\frac{a}{r}\right)^{2s+2}P_{2s+1}(\cos\theta).$$

(b) Calculate the electric charge density $\sigma$ on the outside surface. Note that
your series diverges at $\cos\theta = \pm 1$ as you expected from the infinite
capacitance of this system (zero thickness for the insulating barrier).

ANS. $$\sigma = \varepsilon_0E_n = -\varepsilon_0\left.\frac{\partial V}{\partial r}\right|_{r=a} = \frac{\varepsilon_0V_0}{a}\sum_{s=0}^{\infty}(-1)^s(4s + 3)\frac{(2s - 1)!!}{(2s)!!}P_{2s+1}(\cos\theta).$$

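12.3.21 In the notation of Section 9.4 $|\varphi_s\rangle = \sqrt{(2s + 1)/2}\,P_s(x)$, a Legendre polynomial
is renormalized to unity. Explain how $|\varphi_s\rangle\langle\varphi_s|$ acts as a projection operator. In
particular, show that if $|f\rangle = \sum_n a_n'|\varphi_n\rangle$, then

$$|\varphi_s\rangle\langle\varphi_s|f\rangle = a_s'|\varphi_s\rangle.$$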

12.3.22 Expand $x^8$ as a Legendre series. Determine the Legendre coefficients from Eq.
12.50,

$$a_m = \frac{2m + 1}{2}\int_{-1}^{1}x^8P_m(x)\,dx.$$

Check your values against AMS-55, Table 22.9. This illustrates the expansion
of a simple function. Actually if $f(x)$ is expressed as a power series, the technique
of Exercise 12.2.14 is both faster and more accurate.

Hint. Gaussian quadrature can be used to evaluate the integral.

12.3.23 Calculate and tabulate the electrostatic potential created by a ring of charge,
Example 12.3.3, for $r/a = 1.5(0.5)5.0$ and $\theta = 0°(15°)90°$. Carry terms through
$P_{22}(\cos\theta)$.

Note. The convergence of your series will be slow for $r/a = 1.5$. Truncating the
series at $P_{22}$ limits you to about a four significant figure accuracy.

Check value. For $r/a = 2.5$ and $\theta = 60°$, $\psi = 0.40212\,(q/4\pi\varepsilon_0r)$.

12.3.24 Calculate and tabulate the electrostatic potential created by a charged disk,
Exercise 12.3.17, for $r/a = 1.5(0.5)5.0$ and $\theta = 0°(15°)90°$. Carry terms through
$P_{22}(\cos\theta)$.

Check value. For $r/a = 2.0$ and $\theta = 15°$, $\psi = 0.46638\,(q/4\pi\varepsilon_0r)$.

12.3.25 Calculate the first five (nonvanishing) coefficients in the Legendre series expansion
of $f(x) = 1 - |x|$ using Eq. 12.51 (numerical integration). Actually these
coefficients can be obtained in closed form. Compare your coefficients with
those obtained from Exercise 13.4.4.

ANS. $a_0 = 0.5000$
$a_2 = -0.6250$
$a_4 = 0.1875$
$a_6 = -0.1016$
$a_8 = 0.0664$.

12.3.26 Calculate and tabulate the exterior electrostatic potential created by the two
charged hemispheres of Exercise 12.3.20, for $r/a = 1.5(0.5)5.0$ and $\theta = 0°(15°)90°$.
Carry terms through $P_{23}(\cos\theta)$.

Check value. For $r/a = 2.0$ and $\theta = 45°$,
$V = 0.27066\,V_0$.

12.3.27 (a) Given $f(x) = 2.0$, $|x| < 0.5$; $0$, $0.5 < |x| < 1.0$. Expand $f(x)$ in a Legendre
series and calculate the coefficients $a_n$ through $a_{80}$ (analytically).

(b) Evaluate $\sum_{n=0}^{80}a_nP_n(x)$ for $x = 0.400(0.005)0.600$. Plot your results.

Note. This illustrates the Gibbs phenomenon of Section 14.5 and the danger of
trying to calculate with a series expansion in the vicinity of a discontinuity.


12.4 ALTERNATE DEFINITIONS OF LEGENDRE 
POLYNOMIALS 


Rodrigues' Formula 

The series form of the Legendre polynomials (Eq. 12.8) of Section 12.1 may 
be transformed as follows. From Eq. 12.8 


W 2} 

p n (x)= X(-1 y 


(2n — 2r) ! 


2r 


2”r\(n - r) ! (n - 2r)\ 


(12.64) 


For n an integer 


PM = 


M2] 1 

y (-i) r — — 

r=o 2"r !(n — r ) ! 



1 (d V y (— l) r n ! 2 .- 2fj 

2"n ! \dx J r = 0 r !(n — r ) ! 


(12.64a) 


Note the extension of the upper limit. The reader is asked to show in Exercise 
12.4.1 that the additional terms $[n/2] + 1$ to $n$ in the summation contribute 
nothing. However, the effect of these extra terms is to permit the replacement of 
the new summation by $(x^2 - 1)^n$ (binomial theorem once again) to obtain 

\[
P_n(x) = \frac{1}{2^n n!} \left(\frac{d}{dx}\right)^{n} (x^2 - 1)^n. \tag{12.65}
\]

This is Rodrigues' formula. It is useful in proving many of the properties of the 
Legendre polynomials such as orthogonality. A related application is seen in 
Exercise 12.4.3. The Rodrigues definition is extended in Section 12.5 to define 
the associated Legendre functions. In Section 12.7 it is used to identify the orbital 
angular momentum eigenfunctions. 
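The equivalence of the series form and Rodrigues' formula can be checked mechanically for small $n$. The following is a minimal sketch, not part of the original text: it expands $(x^2 - 1)^n$ by the binomial theorem, differentiates the coefficient list $n$ times, and compares the result with Eq. 12.64. The function names `rodrigues_coeffs` and `series_coeffs` are illustrative assumptions, and exact rational arithmetic is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction
from math import comb, factorial


def rodrigues_coeffs(n):
    """Coefficients of P_n(x), lowest power first, from Rodrigues' formula (Eq. 12.65)."""
    # Binomial expansion of (x^2 - 1)^n: the coefficient of x^(2k) is C(n, k) (-1)^(n-k).
    poly = [Fraction(0)] * (2 * n + 1)
    for k in range(n + 1):
        poly[2 * k] = Fraction(comb(n, k) * (-1) ** (n - k))
    # Differentiate n times, term by term: x^j -> j x^(j-1).
    for _ in range(n):
        poly = [j * c for j, c in enumerate(poly)][1:]
    norm = 2 ** n * factorial(n)
    return [c / norm for c in poly]


def series_coeffs(n):
    """Coefficients of P_n(x), lowest power first, from the series form (Eq. 12.64)."""
    coeffs = [Fraction(0)] * (n + 1)
    for r in range(n // 2 + 1):
        num = (-1) ** r * factorial(2 * n - 2 * r)
        den = 2 ** n * factorial(r) * factorial(n - r) * factorial(n - 2 * r)
        coeffs[n - 2 * r] = Fraction(num, den)
    return coeffs


if __name__ == "__main__":
    for n in range(6):
        print(n, rodrigues_coeffs(n) == series_coeffs(n))
```

For $n = 2$ both routines return the coefficients of $\tfrac{1}{2}(3x^2 - 1)$. The terms with $r > [n/2]$ in Eq. 12.64a drop out automatically here, because differentiating $x^{2n-2r}$ more than $2n - 2r$ times produces a zero coefficient.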
Schlaefli Integral 

Rodrigues' formula provides a means of developing an integral representa- 
tion of $P_n(z)$. Using Cauchy's integral formula (Section 6.4) 

\[
f(z) = \frac{1}{2\pi i} \oint \frac{f(t)}{t - z}\, dt \tag{12.66}
\]

with 

\[
f(z) = (z^2 - 1)^n, \tag{12.67}
\]

we have 

\[
(z^2 - 1)^n = \frac{1}{2\pi i} \oint \frac{(t^2 - 1)^n}{t - z}\, dt. \tag{12.68}
\]

Differentiating $n$ times with respect to $z$ and multiplying by $1/2^n n!$ gives 

\[
P_n(z) = \frac{1}{2^n n!} \frac{d^n}{dz^n} (z^2 - 1)^n
       = \frac{2^{-n}}{2\pi i} \oint \frac{(t^2 - 1)^n}{(t - z)^{n+1}}\, dt, \tag{12.69}
\]

with the contour enclosing the point $t = z$. 

This is the Schlaefli integral. Margenau and Murphy¹ use this to derive the 
recurrence relations we obtained from the generating function. 

The Schlaefli integral may readily be shown to satisfy Legendre's equation by 
differentiation and direct substitution (Fig. 12.12). We obtain 

\[
(1 - z^2) \frac{d^2 P_n}{dz^2} - 2z \frac{dP_n}{dz} + n(n+1) P_n
  = \frac{n+1}{2^n\, 2\pi i} \oint \frac{d}{dt} \left[ \frac{(t^2 - 1)^{n+1}}{(t - z)^{n+2}} \right] dt. \tag{12.70}
\]

For integral $n$ our function $(t^2 - 1)^{n+1}/(t - z)^{n+2}$ is single-valued, and the 
integral around the closed path vanishes. The Schlaefli integral may also be used 
to define $P_\nu(z)$ for nonintegral $\nu$ by integrating around the points $t = z$, $t = 1$, but 
not crossing the cut line $-1$ to $-\infty$. We could equally well encircle the points 
$t = z$ and $t = -1$, but this would lead to nothing new. A contour about $t = +1$ 
and $t = -1$ will lead to a second solution $Q_\nu(z)$, Section 12.10. 

¹ H. Margenau and G. M. Murphy, The Mathematics of Physics and Chemistry, 
2nd ed., Section 3.5. Princeton, N.J.: Van Nostrand (1956). 
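For integral $n$ the Schlaefli integral is easy to evaluate numerically, which gives a quick check on Eq. 12.69. The sketch below is an illustration under stated assumptions rather than the text's own procedure: the contour is taken to be a circle of radius `radius` centered on $t = z$, and the integral is approximated by an equally spaced sum over the circle. The name `schlaefli_legendre` and its parameters are hypothetical. Because the integrand is a finite Laurent polynomial in $e^{i\varphi}$ on this circle, the equally spaced sum reproduces the integral to rounding error once `steps` exceeds $n$.

```python
import cmath
import math


def schlaefli_legendre(n, z, radius=1.0, steps=256):
    """Approximate P_n(z) for integer n >= 0 from the Schlaefli integral, Eq. 12.69."""
    total = 0j
    for k in range(steps):
        phase = cmath.exp(2j * math.pi * k / steps)
        t = z + radius * phase                      # point on the circle |t - z| = radius
        dt = 1j * radius * phase * (2 * math.pi / steps)
        total += (t * t - 1) ** n / (t - z) ** (n + 1) * dt
    return total / (2 ** n * 2j * math.pi)


if __name__ == "__main__":
    # Compare with P_2(x) = (3x^2 - 1)/2 at x = 0.5.
    print(schlaefli_legendre(2, 0.5).real, (3 * 0.25 - 1) / 2)
```

Choosing $z = 1$ gives a numerical counterpart of Exercise 12.4.11. The sketch does not cover nonintegral $\nu$, where the branch cut forces the more careful choice of contour described above.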


EXERCISES 


12.4.1 Show that each term in the summation 

\[
\sum_{r=[n/2]+1}^{n} \left(\frac{d}{dx}\right)^{n} \frac{(-1)^r\, n!}{r!\,(n-r)!}\, x^{2n-2r}
\]

vanishes ($r$ and $n$ integral). 

12.4.2 Using Rodrigues' formula, show that the $P_n(x)$ are orthogonal and that 

\[
\int_{-1}^{1} [P_n(x)]^2\, dx = \frac{2}{2n+1}.
\]

Hint. Use Rodrigues' formula and integrate by parts. 

12.4.3 Show that $\int_{-1}^{1} x^m P_n(x)\, dx = 0$ when $m < n$. 
Hint. Use Rodrigues' formula. 


12.4.4 Show that 

\[
\int_{-1}^{1} x^n P_n(x)\, dx = \frac{2^{n+1}\, n!\, n!}{(2n+1)!}.
\]

Note. You are expected to use Rodrigues' formula and integrate by parts but 
also see if you can get the result from Eq. 12.8 by inspection. 


12.4.5 Show that 

\[
\int_{-1}^{1} x^{2r} P_{2n}(x)\, dx = \frac{2^{2n+1}\,(2r)!\,(r+n)!}{(2r+2n+1)!\,(r-n)!}, \qquad r \ge n.
\]

12.4.6 As a generalization of Exercises 12.4.4 and 12.4.5, show that the Legendre 
expansions of $x^s$ are 

(a) 
\[
x^{2r} = \sum_{n=0}^{r} \frac{2^{2n}\,(4n+1)\,(2r)!\,(r+n)!}{(2r+2n+1)!\,(r-n)!}\, P_{2n}(x), \qquad s = 2r.
\]

(b) 
\[
x^{2r+1} = \sum_{n=0}^{r} \frac{2^{2n+1}\,(4n+3)\,(2r+1)!\,(r+n+1)!}{(2r+2n+3)!\,(r-n)!}\, P_{2n+1}(x), \qquad s = 2r+1.
\]


12.4.7 A plane wave may be expanded in a series of spherical waves by the Rayleigh 
equation 

\[
e^{ikr\cos\gamma} = \sum_{n=0}^{\infty} a_n\, j_n(kr)\, P_n(\cos\gamma).
\]

Show that $a_n = i^n (2n+1)$. 

Hint. 1. Use the orthogonality of the $P_n$ to solve for $a_n j_n(kr)$. 

2. Differentiate $n$ times with respect to $(kr)$ and set $r = 0$ to eliminate 
the $r$-dependence. 

3. Evaluate the remaining integral by Exercise 12.4.4. 

Note. This problem may also be treated by noting that both sides of the equation 
satisfy the Helmholtz equation. The equality can be established by showing that 
the solutions have the same behavior at the origin and also behave alike at large 
distances. A "by inspection" type solution is developed in Section 16.6 using 
Green's functions. 

12.4.8 Verify the Rayleigh equation of Exercise 12.4.7 by starting with the following 
steps: 

1. Differentiate with respect to $(kr)$ to establish 

\[
\sum_n a_n\, j_n'(kr)\, P_n(\cos\gamma) = i \sum_n a_n\, j_n(kr) \cos\gamma\, P_n(\cos\gamma).
\]

2. Use a recurrence relation to replace $\cos\gamma\, P_n(\cos\gamma)$ by a linear 
combination of $P_{n-1}$ and $P_{n+1}$. 

3. Use a recurrence relation to replace $j_n'$ by a linear combina- 
tion of $j_{n-1}$ and $j_{n+1}$. 

12.4.9 From Exercise 12.4.7 show that 

\[
j_n(kr) = \frac{1}{2 i^n} \int_{-1}^{1} e^{ikr\mu}\, P_n(\mu)\, d\mu.
\]

This means that (apart from constant factors) the spherical Bessel function 
$j_n(kr)$ is the Fourier transform of the Legendre polynomial $P_n(\mu)$. 

12.4.10 The Legendre polynomials and the spherical Bessel functions are related by 

\[
j_n(z) = \tfrac{1}{2} (-i)^n \int_0^{\pi} e^{iz\cos\theta}\, P_n(\cos\theta) \sin\theta\, d\theta, \qquad n = 0, 1, 2, \ldots .
\]

Verify this relation by transforming the right-hand side into 

\[
\frac{z^n}{2^{n+1}\, n!} \int_0^{\pi} \cos(z\cos\theta)\, \sin^{2n+1}\theta\, d\theta
\]

and using Exercise 11.7.9. 

12.4.11 By direct evaluation of the Schlaefli integral show that $P_n(1) = 1$. 

12.4.12 Explain why the contour of the Schlaefli integral, Eq. 12.69, is chosen to enclose 
the points $t = z$ and $t = 1$ when $n \to \nu$, not an integer. 

12.4.13 In numerical work (such as the Gauss-Legendre quadrature of Appendix 2) 
it is useful to establish that $P_n(x)$ has $n$ real zeros in the interior of $[-1, 1]$. 
Show that this is so. 

Hint. Rolle's theorem shows that the first derivative of $(x^2 - 1)^n$ has one zero 
in the interior of $[-1, 1]$. Extend this argument to the second, third, and ulti- 
mately to the $n$th derivative. 
12.5 ASSOCIATED LEGENDRE FUNCTIONS 


When Helmholtz's equation is separated in spherical polar coordinates (Sec- 
tion 2.6), one of the separated ordinary differential equations is the associated 
Legendre equation 

\[
\frac{1}{\sin\theta} \frac{d}{d\theta} \left( \sin\theta\, \frac{dP_n^m(\cos\theta)}{d\theta} \right)
  + \left[ n(n+1) - \frac{m^2}{\sin^2\theta} \right] P_n^m(\cos\theta) = 0. \tag{12.71}
\]

With $x = \cos\theta$, this becomes 

\[
(1 - x^2) \frac{d^2}{dx^2} P_n^m(x) - 2x \frac{d}{dx} P_n^m(x)
  + \left[ n(n+1) - \frac{m^2}{1 - x^2} \right] P_n^m(x) = 0. \tag{12.72}
\]

Only if the azimuthal separation constant $m^2 = 0$ do we have Legendre's equa- 
tion, Eq. 12.28. 

One way of developing the solution of the associated Legendre equation is to 
start with the regular Legendre equation and convert it into the associated 
Legendre equation by using multiple differentiation. We take Legendre's 
equation 

\[
(1 - x^2) P_n'' - 2x P_n' + n(n+1) P_n = 0, \tag{12.73}
\]

and with the help of Leibnitz's formula¹ differentiate $m$ times. The result is 

\[
(1 - x^2) u'' - 2x(m+1) u' + (n-m)(n+m+1) u = 0, \tag{12.74}
\]

where 

\[
u \equiv \frac{d^m}{dx^m} P_n(x). \tag{12.75}
\]

Equation 12.74 is not self-adjoint. To put it into self-adjoint form, we replace 
$u(x)$ by 

\[
v(x) = (1 - x^2)^{m/2} u(x) = (1 - x^2)^{m/2} \frac{d^m P_n(x)}{dx^m}. \tag{12.76}
\]

Solving for $u$ and differentiating, we obtain 

\[
u' = \left[ v' + \frac{m x v}{1 - x^2} \right] (1 - x^2)^{-m/2}, \tag{12.77}
\]

\[
u'' = \left[ v'' + \frac{2 m x v'}{1 - x^2} + \frac{m v}{1 - x^2} + \frac{m(m+2) x^2 v}{(1 - x^2)^2} \right] (1 - x^2)^{-m/2}. \tag{12.78}
\]

Substituting into Eq. 12.74, we find that the new function $v$ satisfies the 
differential equation 

\[
(1 - x^2) v'' - 2x v' + \left[ n(n+1) - \frac{m^2}{1 - x^2} \right] v = 0, \tag{12.79}
\]

which is the associated Legendre equation, reducing to Legendre's equation, as 
it must, when $m$ is set equal to zero. Expressed in spherical polar coordinates, the 
associated Legendre equation is 

\[
\frac{1}{\sin\theta} \frac{d}{d\theta} \left( \sin\theta\, \frac{dv}{d\theta} \right)
  + \left[ n(n+1) - \frac{m^2}{\sin^2\theta} \right] v = 0. \tag{12.80}
\]

¹ Leibnitz's formula for the $n$th derivative of a product is 

\[
\frac{d^n}{dx^n} \left[ A(x) B(x) \right] = \sum_{s=0}^{n} \binom{n}{s} \frac{d^{n-s}}{dx^{n-s}} A(x)\, \frac{d^s}{dx^s} B(x),
\qquad \binom{n}{s} = \frac{n!}{(n-s)!\, s!},
\]

a binomial coefficient. 


Associated Legendre Functions 

The regular solutions, relabeled $P_n^m(x)$, are 

\[
v \equiv P_n^m(x) = (1 - x^2)^{m/2} \frac{d^m}{dx^m} P_n(x). \tag{12.81}
\]

These are the associated Legendre functions.² Since the highest power of $x$ in 
$P_n(x)$ is $x^n$, we must have $m \le n$ (or the $m$-fold differentiation will drive our func- 
tion to zero). In quantum mechanics the requirement that $m \le n$ has the physical 
interpretation that the expectation value of the square of the $z$-component of the 
angular momentum is less than or equal to the expectation value of the square 
of the angular momentum vector $\mathbf{L}$, $\langle L_z^2 \rangle \le \langle L^2 \rangle$. 

From the form of Eq. 12.81 we might expect $m$ to be nonnegative, differen- 
tiating a negative number of times not having been defined. However, if $P_n(x)$ is 
expressed by Rodrigues' formula, this limitation on $m$ is relaxed and we may 
have $-n \le m \le n$, negative as well as positive values of $m$ being permitted. 
Using Leibnitz's differentiation formula once again, the reader may show 
(Exercise 12.5.1) that $P_n^{-m}(x)$ and $P_n^m(x)$ are related by 

\[
P_n^{-m}(x) = (-1)^m \frac{(n-m)!}{(n+m)!}\, P_n^m(x). \tag{12.81a}
\]

From our definition of the associated Legendre functions, $P_n^m(x)$, 

\[
P_n^0(x) = P_n(x). \tag{12.82}
\]

In addition, we may develop Table 12.3. 

As with the Legendre polynomials, a generating function for the associated 
Legendre functions does exist: 

\[
\frac{(2m)!\,(1 - x^2)^{m/2}}{2^m\, m!\,(1 - 2tx + t^2)^{m+1/2}} = \sum_{s=0}^{\infty} P_{s+m}^m(x)\, t^s. \tag{12.83}
\]

However, because of its more cumbersome form and lack of any direct physical 
application, it is seldom used. 


TABLE 12.3 Associated Legendre Functions 

$P_1^1(x) = (1 - x^2)^{1/2} = \sin\theta$ 
$P_2^1(x) = 3x(1 - x^2)^{1/2} = 3\cos\theta \sin\theta$ 
$P_2^2(x) = 3(1 - x^2) = 3\sin^2\theta$ 
$P_3^1(x) = \tfrac{3}{2}(5x^2 - 1)(1 - x^2)^{1/2} = \tfrac{3}{2}(5\cos^2\theta - 1)\sin\theta$ 
$P_3^2(x) = 15x(1 - x^2) = 15\cos\theta \sin^2\theta$ 
$P_3^3(x) = 15(1 - x^2)^{3/2} = 15\sin^3\theta$ 
$P_4^1(x) = \tfrac{5}{2}(7x^3 - 3x)(1 - x^2)^{1/2} = \tfrac{5}{2}(7\cos^3\theta - 3\cos\theta)\sin\theta$ 
$P_4^2(x) = \tfrac{15}{2}(7x^2 - 1)(1 - x^2) = \tfrac{15}{2}(7\cos^2\theta - 1)\sin^2\theta$ 
$P_4^3(x) = 105x(1 - x^2)^{3/2} = 105\cos\theta \sin^3\theta$ 
$P_4^4(x) = 105(1 - x^2)^2 = 105\sin^4\theta$ 


Recurrence Relations 

As expected, the associated Legendre functions satisfy recurrence relations. 
Because of the existence of two indices instead of just one, we have a wide 
variety of recurrence relations: 

\[
P_n^{m+1} - \frac{2mx}{(1 - x^2)^{1/2}}\, P_n^m + [n(n+1) - m(m-1)]\, P_n^{m-1} = 0, \tag{12.84}
\]

\[
(2n+1)\, x\, P_n^m = (n+m)\, P_{n-1}^m + (n-m+1)\, P_{n+1}^m, \tag{12.85}
\]

\[
(2n+1)(1 - x^2)^{1/2} P_n^m = P_{n+1}^{m+1} - P_{n-1}^{m+1}
  = (n+m)(n+m-1)\, P_{n-1}^{m-1} - (n-m+1)(n-m+2)\, P_{n+1}^{m-1}, \tag{12.86}
\]

\[
(1 - x^2)^{1/2}\, \frac{dP_n^m}{dx} = \tfrac{1}{2} P_n^{m+1} - \tfrac{1}{2}(n+m)(n-m+1)\, P_n^{m-1}. \tag{12.87}
\]

² Occasionally (as in AMS-55), the reader will find the associated Legendre 
functions defined with an additional factor of $(-1)^m$. This $(-1)^m$ seems an 
unnecessary complication at this point. It will be included in the definition 
of the spherical harmonics $Y_n^m(\theta, \varphi)$ in Section 12.6. 

These relations, and many other similar ones, may be verified by use of the 
generating function (Eq. 12.4), by substitution of the series solution of the 
associated Legendre equation (12.79), or by reduction to the Legendre polynomial 
recurrence relations, using Eq. 12.81. As an example of the last method, consider 
the third equation in the preceding set. It is similar to Eq. 12.23: 

\[
(2n+1) P_n(x) = P_{n+1}'(x) - P_{n-1}'(x). \tag{12.88}
\]

Let us differentiate this Legendre polynomial recurrence relation $m$ times to 
obtain 

\[
(2n+1) \frac{d^m}{dx^m} P_n(x) = \frac{d^{m+1}}{dx^{m+1}} P_{n+1}(x) - \frac{d^{m+1}}{dx^{m+1}} P_{n-1}(x). \tag{12.89}
\]

Now multiplying by $(1 - x^2)^{(m+1)/2}$ and using the definition of $P_n^m(x)$, we obtain 
the first part of Eq. 12.86. 
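Recurrence relations of this kind are also the usual route to numerical values. The following sketch is an illustration, not the text's own routine: it evaluates $P_n^m(x)$ for $0 \le m \le n$ by running Eq. 12.85 upward in the lower index, starting from $P_m^m(x) = (2m-1)!!\,(1 - x^2)^{m/2}$ (the result quoted in Exercise 12.5.4) and from the fact that $P_{m-1}^m$ vanishes. The function name `assoc_legendre` is an assumption, and the sign convention is that of Eq. 12.81, without the extra $(-1)^m$.

```python
import math


def assoc_legendre(n, m, x):
    """P_n^m(x) for 0 <= m <= n and |x| <= 1, in the convention of Eq. 12.81."""
    if not 0 <= m <= n:
        raise ValueError("this sketch assumes 0 <= m <= n")
    s2 = max(0.0, 1.0 - x * x)          # guard against tiny negative rounding at |x| = 1
    # Seed values: P_{m-1}^m = 0 and P_m^m = (2m - 1)!! (1 - x^2)^(m/2).
    p_prev = 0.0
    p_curr = math.prod(range(1, 2 * m, 2)) * s2 ** (m / 2)
    # Eq. 12.85 solved for P_{k+1}^m:
    #   (k - m + 1) P_{k+1}^m = (2k + 1) x P_k^m - (k + m) P_{k-1}^m
    for k in range(m, n):
        p_prev, p_curr = p_curr, ((2 * k + 1) * x * p_curr - (k + m) * p_prev) / (k - m + 1)
    return p_curr


if __name__ == "__main__":
    x = 0.5
    # Table 12.3 entry: P_2^1(x) = 3x (1 - x^2)^(1/2).
    print(assoc_legendre(2, 1, x), 3 * x * math.sqrt(1 - x * x))
```

Setting $m = 0$ reduces the loop to the ordinary Legendre recurrence, so the same routine returns $P_n(x)$.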


Parity 

The parity relation satisfied by the associated Legendre functions may be 
determined by examination of the defining equation (12.81). As $x \to -x$, we 
already know that $P_n(x)$ contributes a $(-1)^n$. The $m$-fold differentiation yields a 
factor of $(-1)^m$. Hence we have 

\[
P_n^m(-x) = (-1)^{n+m} P_n^m(x). \tag{12.90}
\]

A glance at Table 12.3 verifies this for $1 \le m \le n \le 4$. 

Also, from the definition in Eq. 12.81 

\[
P_n^m(\pm 1) = 0, \qquad \text{for } m \ne 0. \tag{12.91}
\]


Orthogonality 

The orthogonality of the $P_n^m(x)$ follows from the differential equation just as 
for the $P_n(x)$ (Section 12.3); the term $-m^2/(1 - x^2)$ cancels out, assuming $m$ is the 
same in both cases. However, it is instructive to demonstrate the orthogonality 
by another method, a method that will also provide the normalization constant. 

Using the definition in Eq. 12.81 and Rodrigues' formula (Eq. 12.65) for $P_n(x)$, 
we find 

\[
\int_{-1}^{1} P_p^m(x) P_q^m(x)\, dx
  = \frac{(-1)^m}{2^{p+q}\, p!\, q!} \int_{-1}^{1} X^m \left( \frac{d^{p+m}}{dx^{p+m}} X^p \right) \left( \frac{d^{q+m}}{dx^{q+m}} X^q \right) dx. \tag{12.92}
\]

The function $X$ is given by $X \equiv (x^2 - 1)$. If $p \ne q$, let us assume that $p < q$. 
Notice that the superscript $m$ is the same for both functions. This is an essential 
condition. The technique is to integrate repeatedly by parts; all the integrated 
parts will vanish as long as there is a factor $X = x^2 - 1$. Let us integrate $q + m$ 
times to obtain 

\[
\int_{-1}^{1} P_p^m(x) P_q^m(x)\, dx
  = \frac{(-1)^m (-1)^{q+m}}{2^{p+q}\, p!\, q!} \int_{-1}^{1} \frac{d^{q+m}}{dx^{q+m}} \left( X^m \frac{d^{p+m}}{dx^{p+m}} X^p \right) X^q\, dx. \tag{12.93}
\]

The integrand on the right-hand side is now expanded by Leibnitz's formula to 
give 

\[
X^q \frac{d^{q+m}}{dx^{q+m}} \left( X^m \frac{d^{p+m}}{dx^{p+m}} X^p \right)
  = X^q \sum_{i=0}^{q+m} \frac{(q+m)!}{i!\,(q+m-i)!}\, \frac{d^{q+m-i}}{dx^{q+m-i}} X^m\, \frac{d^{p+m+i}}{dx^{p+m+i}} X^p. \tag{12.94}
\]

Since the term $X^m$ contains no power of $x$ greater than $x^{2m}$, we must have 

\[
q + m - i \le 2m \tag{12.95}
\]

or the derivative will vanish. Similarly, 

\[
p + m + i \le 2p. \tag{12.96}
\]

In the solution of these equations for the index $i$ the conditions for a nonzero 
result are 

\[
i \ge q - m, \qquad i \le p - m. \tag{12.97}
\]

If $p < q$, as assumed, there is no solution and the integral vanishes. The same 
result obviously must follow if $p > q$. 
For the remaining case, $p = q$, we may still have the single term correspond- 
ing to $i = q - m$. Putting Eq. 12.94 into Eq. 12.93, we have 

\[
\int_{-1}^{1} [P_q^m(x)]^2\, dx
  = \frac{(-1)^{q+2m}\,(q+m)!}{2^{2q}\, q!\, q!\,(2m)!\,(q-m)!} \int_{-1}^{1} X^q \left( \frac{d^{2m}}{dx^{2m}} X^m \right) \left( \frac{d^{2q}}{dx^{2q}} X^q \right) dx. \tag{12.98}
\]

Since 

\[
X^m = (x^2 - 1)^m = x^{2m} - m x^{2m-2} + \cdots, \tag{12.99}
\]

\[
\frac{d^{2m}}{dx^{2m}} X^m = (2m)!. \tag{12.100}
\]

Eq. 12.98 reduces to 

\[
\int_{-1}^{1} [P_q^m(x)]^2\, dx
  = \frac{(-1)^{q+2m}\,(2q)!\,(q+m)!}{2^{2q}\, q!\, q!\,(q-m)!} \int_{-1}^{1} X^q\, dx. \tag{12.101}
\]

The integral on the right is just 

\[
\int_{-1}^{1} X^q\, dx = (-1)^q \int_0^{\pi} \sin^{2q+1}\theta\, d\theta = \frac{(-1)^q\, 2^{2q+1}\, q!\, q!}{(2q+1)!} \tag{12.102}
\]

(compare Exercise 10.4.9). Combining Eqs. 12.101 and 12.102, we have the 
orthogonality integral 

\[
\int_{-1}^{1} P_p^m(x) P_q^m(x)\, dx = \frac{2}{2q+1}\, \frac{(q+m)!}{(q-m)!}\, \delta_{p,q} \tag{12.103}
\]

or, in spherical polar coordinates, 

\[
\int_0^{\pi} P_p^m(\cos\theta) P_q^m(\cos\theta) \sin\theta\, d\theta = \frac{2}{2q+1}\, \frac{(q+m)!}{(q-m)!}\, \delta_{p,q}. \tag{12.104}
\]

The orthogonality of the Legendre polynomials is actually a special case of 
this result, obtained by setting $m$ equal to zero; that is, for $m = 0$, Eq. 12.103 
reduces to Eqs. 12.43 and 12.48. In both Eqs. 12.103 and 12.104 our Sturm- 
Liouville theory of Chapter 9 could provide the Kronecker delta. A special 
calculation, such as the analysis here, is required for the normalization constant. 
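Equation 12.103 lends itself to a direct numerical check. The sketch below is an illustration under stated assumptions: the helper `assoc_legendre` (a hypothetical name) builds $P_n^m(x)$ from the recurrence relation Eq. 12.85, and the integral is estimated with a composite Simpson rule, which is adequate here because the product $P_p^m P_q^m$ is a polynomial. Neither choice is taken from the text.

```python
import math


def assoc_legendre(n, m, x):
    """P_n^m(x) for 0 <= m <= n and |x| <= 1 (Eq. 12.81 convention), via Eq. 12.85."""
    s2 = max(0.0, 1.0 - x * x)
    p_prev, p_curr = 0.0, math.prod(range(1, 2 * m, 2)) * s2 ** (m / 2)
    for k in range(m, n):
        p_prev, p_curr = p_curr, ((2 * k + 1) * x * p_curr - (k + m) * p_prev) / (k - m + 1)
    return p_curr


def overlap(p, q, m, intervals=2000):
    """Composite Simpson estimate of the integral of P_p^m P_q^m over [-1, 1]."""
    total = 0.0
    for j in range(intervals + 1):
        x = -1.0 + 2.0 * j / intervals
        weight = 1 if j in (0, intervals) else (4 if j % 2 else 2)
        total += weight * assoc_legendre(p, m, x) * assoc_legendre(q, m, x)
    return total * (2.0 / intervals) / 3.0


def norm_12_103(q, m):
    """Right-hand side of Eq. 12.103 for p = q."""
    return 2.0 / (2 * q + 1) * math.factorial(q + m) / math.factorial(q - m)


if __name__ == "__main__":
    print(overlap(2, 2, 1), norm_12_103(2, 1))   # both should be close to 12/5
    print(overlap(2, 4, 1))                      # should be close to zero
```

The case $p = q = 2$, $m = 1$ can also be done by hand: $[P_2^1(x)]^2 = 9x^2(1 - x^2)$ integrates to $12/5$, in agreement with $\tfrac{2}{5} \cdot 3!/1!$.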

The orthogonality of the associated Legendre functions over the same interval 
and with the same weighting factor as the Legendre polynomials does not con- 
tradict the uniqueness of the Gram-Schmidt construction of the Legendre 
polynomials, Example 9.3.1. Table 12.3 suggests (and Section 12.4 verifies) that 
$\int_{-1}^{1} P_p^m(x) P_q^m(x)\, dx$ may be written as 

\[
\int_{-1}^{1} p_p^m(x)\, p_q^m(x)\, (1 - x^2)^m\, dx.
\]

Here 

\[
p_p^m(x)\, (1 - x^2)^{m/2} = P_p^m(x).
\]

The functions $p_p^m(x)$ may be constructed by the Gram-Schmidt procedure with 
the weighting function $w(x) = (1 - x^2)^m$. 

It is possible to develop an orthogonality relation for associated Legendre 
functions of the same lower index but different upper index. We find 

\[
\int_{-1}^{1} P_n^m(x) P_n^k(x)\, (1 - x^2)^{-1}\, dx = \frac{(n+m)!}{m\,(n-m)!}\, \delta_{m,k}. \tag{12.105}
\]

Note that a new weighting factor, $(1 - x^2)^{-1}$, has been introduced. This form is 
essentially a mathematical curiosity. In physical problems orthogonality of the 
$\varphi$ dependence ties the two upper indices together and leads to Eq. 12.104. 

EXAMPLE 12.5.1 Magnetic Induction Field of a Current Loop 

Like the other differential equations of mathematical physics, the associated 
Legendre equation is likely to pop up quite unexpectedly. As an illustration, 
consider the magnetic induction field B and magnetic vector potential A created 
by a single circular current loop in the equatorial plane (Fig. 12.13). 


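FIG. 12.13 Circular current loop 

We know from electromagnetic theory that the contribution of current 
element $I\, d\boldsymbol{\lambda}$ to the magnetic vector potential is 

\[
d\mathbf{A} = \frac{\mu_0}{4\pi}\, \frac{I\, d\boldsymbol{\lambda}}{r}. \tag{12.106}
\]

(This follows from Exercise 1.14.4.) Equation 12.106, plus the symmetry of our 
system, shows that $\mathbf{A}$ has only a $\boldsymbol{\varphi}_0$-component and that the component is 
independent of $\varphi$,³ 

\[
\mathbf{A} = \boldsymbol{\varphi}_0\, A_\varphi(r, \theta). \tag{12.107}
\]

³ Pair off corresponding current elements $I\, d\boldsymbol{\lambda}(\varphi_1)$ and $I\, d\boldsymbol{\lambda}(\varphi_2)$, where 
$\varphi - \varphi_1 = \varphi_2 - \varphi$. 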
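By Maxwell's equations 

\[
\nabla \times \mathbf{H} = \mathbf{J}, \qquad (\partial \mathbf{D}/\partial t = 0, \text{ SI units}). \tag{12.108}
\]

Since 

\[
\mu_0 \mathbf{H} = \mathbf{B} = \nabla \times \mathbf{A}, \tag{12.109}
\]

we have 

\[
\nabla \times \nabla \times \mathbf{A} = \mu_0 \mathbf{J}, \tag{12.110}
\]

where $\mathbf{J}$ is the current density. In our problem $\mathbf{J}$ is zero everywhere except in the 
current loop. Therefore, away from the loop, 

\[
\nabla \times \nabla \times \boldsymbol{\varphi}_0\, A_\varphi(r, \theta) = 0, \tag{12.111}
\]

using Eq. 12.107. 

From the expression for the curl in spherical polar coordinates (Section 2.5), 
we obtain (Example 2.5.2) 

\[
\nabla \times \nabla \times \boldsymbol{\varphi}_0\, A_\varphi(r, \theta)
  = \boldsymbol{\varphi}_0 \left[ -\frac{\partial^2 A_\varphi}{\partial r^2} - \frac{2}{r} \frac{\partial A_\varphi}{\partial r}
    - \frac{1}{r^2} \frac{\partial^2 A_\varphi}{\partial \theta^2} - \frac{1}{r^2} \frac{\partial}{\partial \theta} (\cot\theta\, A_\varphi) \right] = 0. \tag{12.112}
\]

Letting $A_\varphi(r, \theta) = R(r)\Theta(\theta)$ and separating variables, we have 

\[
r^2 \frac{d^2 R}{dr^2} + 2r \frac{dR}{dr} - n(n+1) R = 0, \tag{12.113}
\]

\[
\frac{d^2 \Theta}{d\theta^2} + \cot\theta\, \frac{d\Theta}{d\theta} + n(n+1)\Theta - \frac{\Theta}{\sin^2\theta} = 0. \tag{12.114}
\]

The second equation is the associated Legendre equation (12.80) with $m = 1$, and 
we may immediately write 

\[
\Theta(\theta) = P_n^1(\cos\theta). \tag{12.115}
\]

The separation constant $n(n+1)$ was chosen to keep this solution well behaved. 

By trial, letting $R(r) = r^\alpha$, we find that $\alpha = n$, $-n - 1$. The first possibility is 
discarded, for our solution must vanish as $r \to \infty$. Hence 

\[
A_{\varphi n} = \frac{b_n}{r^{n+1}}\, P_n^1(\cos\theta) = c_n \left( \frac{a}{r} \right)^{n+1} P_n^1(\cos\theta) \tag{12.116}
\]

and 

\[
A_\varphi(r, \theta) = \sum_{n=1}^{\infty} c_n \left( \frac{a}{r} \right)^{n+1} P_n^1(\cos\theta), \qquad (r > a). \tag{12.117}
\]

Here $a$ is the radius of the current loop. 

Since $A_\varphi$ must be invariant to reflection in the equatorial plane by the sym- 
metry of our problem, 

\[
A_\varphi(r, \cos\theta) = A_\varphi(r, -\cos\theta), \tag{12.118}
\]

the parity property of $P_n^m(\cos\theta)$ (Eq. 12.90) shows that $c_n = 0$ for $n$ even. 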
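To complete the evaluation of the constants, we may use Eq. 12.117 to cal- 
culate $B_z$ along the $z$-axis [$B_z = B_r(r, \theta = 0)$] and compare with the expression 
obtained from the Biot and Savart law. This is the same technique that is used 
in Example 12.3.3. We have (compare Eq. 2.47) 

\[
B_r = \nabla \times \mathbf{A} \big|_r = \frac{1}{r \sin\theta} \frac{\partial}{\partial \theta} (\sin\theta\, A_\varphi)
    = \frac{\cot\theta}{r} A_\varphi + \frac{1}{r} \frac{\partial A_\varphi}{\partial \theta}. \tag{12.119}
\]

Using 

\[
\frac{\partial P_n^1(\cos\theta)}{\partial \theta} = -\sin\theta\, \frac{dP_n^1(\cos\theta)}{d(\cos\theta)}
  = -\tfrac{1}{2} P_n^2 + \frac{n(n+1)}{2} P_n^0 \tag{12.120}
\]

(Eq. 12.87) and then Eq. 12.84 with $m = 1$: 

\[
P_n^2(\cos\theta) - \frac{2\cos\theta}{\sin\theta}\, P_n^1(\cos\theta) + n(n+1) P_n(\cos\theta) = 0, \tag{12.121}
\]

we obtain 

\[
B_r(r, \theta) = \sum_{n=1}^{\infty} c_n\, n(n+1)\, \frac{a^{n+1}}{r^{n+2}}\, P_n(\cos\theta), \qquad r > a \tag{12.122}
\]

(for all $\theta$). In particular, for $\theta = 0$, 

\[
B_r(r, 0) = \sum_{n=1}^{\infty} c_n\, n(n+1)\, \frac{a^{n+1}}{r^{n+2}}. \tag{12.123}
\]

We may also obtain 

\[
B_\theta(r, \theta) = -\frac{1}{r} \frac{\partial (r A_\varphi)}{\partial r}
  = \sum_{n=1}^{\infty} c_n\, n\, \frac{a^{n+1}}{r^{n+2}}\, P_n^1(\cos\theta), \qquad r > a. \tag{12.124}
\]
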

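The Biot and Savart law states that 

\[
d\mathbf{B} = \frac{\mu_0}{4\pi}\, I\, \frac{d\boldsymbol{\lambda} \times \mathbf{r}_0}{r^2} \qquad \text{(SI units)}. \tag{12.125}
\]

We now integrate over the perimeter of our loop (radius $a$). The geometry is 
shown in Fig. 12.14. The resulting magnetic induction field is $\mathbf{k} B_z$, along the 
$z$-axis, with 

\[
B_z = \frac{\mu_0 I}{2}\, a^2 (a^2 + z^2)^{-3/2} = \frac{\mu_0 I}{2}\, \frac{a^2}{z^3} \left( 1 + \frac{a^2}{z^2} \right)^{-3/2}. \tag{12.126}
\]

FIG. 12.14 Law of Biot and Savart applied to a circular loop 

Expanding by the binomial theorem, we obtain 

\[
B_z = \frac{\mu_0 I}{2}\, \frac{a^2}{z^3} \left[ 1 - \frac{3}{2} \left( \frac{a}{z} \right)^2 + \frac{15}{8} \left( \frac{a}{z} \right)^4 - \cdots \right]
    = \frac{\mu_0 I}{2}\, \frac{a^2}{z^3} \sum_{s=0}^{\infty} (-1)^s \frac{(2s+1)!!}{(2s)!!} \left( \frac{a}{z} \right)^{2s}, \qquad z > a. \tag{12.127}
\]

Equating Eqs. 12.123 and 12.127 term by term (with $r = z$),⁴ we find 

\[
c_1 = \frac{\mu_0 I}{4}, \qquad c_3 = -\frac{\mu_0 I}{16}, \qquad c_2 = c_4 = \cdots = 0,
\]

\[
c_n = (-1)^{(n-1)/2}\, \frac{\mu_0 I}{2n(n+1)}\, \frac{(n/2)!}{[(n-1)/2]!\,(\tfrac{1}{2})!}, \qquad n \text{ odd}. \tag{12.128}
\]

Equivalently, we may write 

\[
c_{2n+1} = (-1)^n\, \frac{\mu_0 I}{2^{2n+2}}\, \frac{(2n)!}{n!\,(n+1)!} = (-1)^n\, \frac{\mu_0 I}{2}\, \frac{(2n-1)!!}{(2n+2)!!} \tag{12.129}
\]

and 

\[
A_\varphi(r, \theta) = \left( \frac{a}{r} \right)^2 \sum_{n=0}^{\infty} c_{2n+1} \left( \frac{a}{r} \right)^{2n} P_{2n+1}^1(\cos\theta), \tag{12.130}
\]

\[
B_r(r, \theta) = \frac{a^2}{r^3} \sum_{n=0}^{\infty} c_{2n+1}\, (2n+1)(2n+2) \left( \frac{a}{r} \right)^{2n} P_{2n+1}(\cos\theta), \tag{12.131}
\]

\[
B_\theta(r, \theta) = \frac{a^2}{r^3} \sum_{n=0}^{\infty} c_{2n+1}\, (2n+1) \left( \frac{a}{r} \right)^{2n} P_{2n+1}^1(\cos\theta). \tag{12.132}
\]

⁴ The descending power series is also unique. 
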

These fields may be described in closed form by the use of elliptic integrals. 
Exercise 5.8.4 is an illustration of this approach. A third possibility is direct 
integration of Eq. 12.106 by expanding the factor 1/r as a Legendre polynomial 
generating function. The current is specified by Dirac delta functions. These 
methods have the advantage of yielding the constants c n directly. 
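The matching of the Legendre series to the Biot and Savart result on the axis can be reproduced numerically. The sketch below is illustrative only: it evaluates $c_{2n+1}$ from the double-factorial form of Eq. 12.129 in units of $\mu_0 I$, sums Eq. 12.131 at $\theta = 0$ (where $P_{2n+1}(1) = 1$), and compares with Eq. 12.126. The function names, the choice of units, and the truncation at `terms` are assumptions made for the example.

```python
import math


def double_factorial(k):
    """k!! for integer k >= -1, with (-1)!! = 0!! = 1."""
    return math.prod(range(k, 0, -2))


def c_odd(n):
    """c_{2n+1} in units of mu_0 I, from Eq. 12.129."""
    return (-1) ** n * 0.5 * double_factorial(2 * n - 1) / double_factorial(2 * n + 2)


def b_axis_series(z_over_a, terms=60):
    """On-axis B_z in units of mu_0 I / a from Eq. 12.131 at theta = 0 (needs z > a)."""
    u = 1.0 / z_over_a                  # a / r on the axis
    series = sum(c_odd(n) * (2 * n + 1) * (2 * n + 2) * u ** (2 * n) for n in range(terms))
    return u ** 3 * series


def b_axis_closed(z_over_a):
    """On-axis B_z in units of mu_0 I / a from the Biot and Savart result, Eq. 12.126."""
    return 0.5 * (1.0 + z_over_a ** 2) ** -1.5


if __name__ == "__main__":
    print(c_odd(0), c_odd(1))                         # 0.25 and -0.0625, as in Eq. 12.128
    print(b_axis_series(2.0), b_axis_closed(2.0))
```

The series converges only for $r > a$, and slowly as $r \to a$, which is one practical reason for turning to the closed-form elliptic integral description near the loop.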

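A comparison of magnetic current loop dipole fields and finite electric dipole 
fields may be of interest. For the magnetic current loop dipole the preceding 
analysis gives 

\[
B_r(r, \theta) = \frac{\mu_0 I a^2}{2 r^3} \cos\theta \left[ 1 - \frac{3}{2} \left( \frac{a}{r} \right)^2 \frac{P_3}{P_1} + \cdots \right], \tag{12.133}
\]

\[
B_\theta(r, \theta) = \frac{\mu_0 I a^2}{4 r^3} \sin\theta \left[ 1 - \frac{3}{4} \left( \frac{a}{r} \right)^2 \frac{P_3^1}{P_1^1} + \cdots \right]. \tag{12.134}
\]
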


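From the finite electric dipole potential of Section 12.1 we have 

\[
E_r(r, \theta) = \frac{q a}{\pi \varepsilon_0 r^3} \cos\theta \left[ 1 + 2 \left( \frac{a}{r} \right)^2 \frac{P_3}{P_1} + \cdots \right], \tag{12.135}
\]

\[
E_\theta(r, \theta) = \frac{q a}{2 \pi \varepsilon_0 r^3} \sin\theta \left[ 1 + \left( \frac{a}{r} \right)^2 \frac{P_3^1}{P_1^1} + \cdots \right]. \tag{12.136}
\]
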

The two fields agree in form as far as the leading term is concerned ($r^{-3} P_1$), and 
this is the basis for calling them both dipole fields. 

As with electric multipoles, it is sometimes convenient to discuss point mag- 
netic multipoles. For the dipole case, Eqs. 12.133 and 12.134, the point dipole is 
formed by taking the limit $a \to 0$, $I \to \infty$ with $I a^2$ held constant. With $\mathbf{n}$ a unit 
vector normal to the current loop (positive sense by right-hand rule, Section 
1.10) the magnetic moment $\mathbf{m}$ is given by $\mathbf{m} = \mathbf{n}\, I \pi a^2$. 
EXERCISES 

12.5.1 Prove that 

\[
P_n^{-m}(x) = (-1)^m \frac{(n-m)!}{(n+m)!}\, P_n^m(x),
\]

where $P_n^m(x)$ is defined by 

\[
P_n^m(x) = \frac{1}{2^n n!} (1 - x^2)^{m/2} \frac{d^{n+m}}{dx^{n+m}} (x^2 - 1)^n.
\]

Hint. One approach is to apply Leibnitz's formula to $(x+1)^n (x-1)^n$. 

12.5.2 Show that 

\[
P_{2n}^1(0) = 0,
\]

\[
P_{2n+1}^1(0) = (-1)^n \frac{(2n+1)!}{(2^n n!)^2} = (-1)^n \frac{(2n+1)!!}{(2n)!!},
\]

by each of the three methods: 

(a) use of recurrence relations, 

(b) expansion of the generating function, 

(c) Rodrigues' formula. 


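12.5.3 Evaluate $P_n^m(0)$. 

ANS. 
\[
P_n^m(0) =
\begin{cases}
(-1)^{(n-m)/2}\, \dfrac{(n+m)!}{2^n \left( \frac{n-m}{2} \right)! \left( \frac{n+m}{2} \right)!}, & n + m \text{ even}, \\[2ex]
0, & n + m \text{ odd}.
\end{cases}
\]

Also, 
\[
P_n^m(0) = (-1)^{(n-m)/2}\, \frac{(n+m-1)!!}{(n-m)!!}, \qquad n + m \text{ even}.
\]

12.5.4 Show that 

\[
P_n^n(\cos\theta) = (2n-1)!!\, \sin^n\theta, \qquad n = 0, 1, 2, \ldots .
\]

12.5.5 Derive the associated Legendre recurrence relation 

\[
P_n^{m+1}(x) - \frac{2mx}{(1 - x^2)^{1/2}}\, P_n^m(x) + [n(n+1) - m(m-1)]\, P_n^{m-1}(x) = 0.
\]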

12.5.6 Develop a recurrence relation that will yield $P_n^1(x)$ as 

\[
P_n^1(x) = f_1(x, n) P_n(x) + f_2(x, n) P_{n-1}(x).
\]

Follow either (a) or (b). 

(a) Derive a recurrence relation of the preceding form. Give $f_1(x, n)$ and $f_2(x, n)$ 
explicitly. 

(b) Find the sought for recurrence relation in print. 

(1) Give the source. 

(2) Verify the recurrence relation. 

ANS. 
\[
P_n^1(x) = -\frac{n x}{(1 - x^2)^{1/2}}\, P_n(x) + \frac{n}{(1 - x^2)^{1/2}}\, P_{n-1}(x).
\]

12.5.7 Show that 

\[
\sin\theta\, P_n'(\cos\theta) = P_n^1(\cos\theta).
\]
12.5.8 Show that 

(a) 
\[
\int_0^{\pi} \left( \frac{dP_n^m}{d\theta} \frac{dP_{n'}^m}{d\theta} + \frac{m^2 P_n^m P_{n'}^m}{\sin^2\theta} \right) \sin\theta\, d\theta
  = \frac{2 n(n+1)}{2n+1}\, \frac{(n+m)!}{(n-m)!}\, \delta_{n n'},
\]

(b) 
\[
\int_0^{\pi} \left( \frac{P_n^1}{\sin\theta} \frac{dP_{n'}^1}{d\theta} + \frac{P_{n'}^1}{\sin\theta} \frac{dP_n^1}{d\theta} \right) \sin\theta\, d\theta = 0.
\]

These integrals occur in the theory of scattering of electromagnetic waves by 
spheres. 


12.5.9 As a repeat of Exercise 12.3.6, show, using associated Legendre functions, that 

\[
\int_{-1}^{1} x (1 - x^2) P_n'(x) P_m'(x)\, dx
  = \frac{n+1}{2n+1}\, \frac{2}{2n-1}\, \frac{n!}{(n-2)!}\, \delta_{m,n-1}
  + \frac{n}{2n+1}\, \frac{2}{2n+3}\, \frac{(n+2)!}{n!}\, \delta_{m,n+1}.
\]


12.5.10 Evaluate 

\[
\int_0^{\pi} \sin^2\theta\, P_n^1(\cos\theta)\, d\theta.
\]


12.5.11 The associated Legendre polynomial $P_n^m(x)$ satisfies the self-adjoint differential 
equation 

\[
(1 - x^2) \frac{d^2 P_n^m(x)}{dx^2} - 2x \frac{dP_n^m(x)}{dx} + \left[ n(n+1) - \frac{m^2}{1 - x^2} \right] P_n^m(x) = 0.
\]

From the differential equations for $P_n^m(x)$ and $P_n^k(x)$ show that 

\[
\int_{-1}^{1} P_n^m(x) P_n^k(x)\, \frac{dx}{1 - x^2} = 0
\]

for $k \ne m$. 


12.5.12 Determine the vector potential of a magnetic quadrupole by differentiating 
the magnetic dipole potential. 

ANS. 
\[
\mathbf{A}_{MQ} = \frac{\mu_0}{2} (I a^2)(dz)\, \boldsymbol{\varphi}_0\, \frac{P_2^1(\cos\theta)}{r^3} + \text{higher-order terms},
\]

\[
\mathbf{B}_{MQ} = \mu_0 (I a^2)(dz) \left[ \mathbf{r}_0\, \frac{3 P_2(\cos\theta)}{r^4} + \boldsymbol{\theta}_0\, \frac{P_2^1(\cos\theta)}{r^4} \right] + \cdots .
\]

This corresponds to placing a current loop of radius $a$ at $z = dz$, an oppositely 
directed current loop at $z = -dz$, and letting $a \to 0$ subject to $(dz) \times$ (dipole 
strength) equal constant. 

Another approach to this problem would be to integrate $d\mathbf{A}$ (Eq. 12.106), 
to expand the denominator in a series of Legendre polynomials, and to use the 
Legendre polynomial addition theorem (Section 12.8). 


12.5.13 A single loop of wire of radius $a$ carries a current $I$. 

(a) Find the magnetic induction $\mathbf{B}$ for $r < a$. 

(b) Calculate the integral of the magnetic flux ($\mathbf{B} \cdot d\boldsymbol{\sigma}$) over the area of the 
current loop, that is, 

\[
\int_0^{a} \int_0^{2\pi} B_z\!\left( r, \theta = \frac{\pi}{2} \right) d\varphi\, r\, dr.
\]

ANS. $\infty$. 

The earth is within such a ring current in which $I$ approximates millions 
of amperes arising from the drift of charged particles in the Van Allen belt. 

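12.5.14 (a) Show that in the point dipole limit the magnetic induction field of the 
current loop becomes 

\[
B_r(r, \theta) = \frac{\mu_0}{2\pi}\, \frac{m}{r^3}\, P_1(\cos\theta),
\]

\[
B_\theta(r, \theta) = \frac{\mu_0}{4\pi}\, \frac{m}{r^3}\, P_1^1(\cos\theta),
\]

with $m = I \pi a^2$. 

(b) Compare these results with the magnetic induction of the point magnetic 
dipole of Exercise 1.8.17. Take $\mathbf{m} = \mathbf{k} m$. 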

12.5.15 A uniformly charged spherical shell is rotating with constant angular velocity. 

(a) Calculate the magnetic induction B along the axis of rotation outside the 
sphere. 

(b) Using the vector potential series of Section 12.5, find A and then B for 
all space outside the sphere. 

12.5.16 In the liquid drop model of the nucleus the spherical nucleus is subjected to 
small deformations. Consider a sphere of radius $r_0$ that is deformed so that 
its new surface is given by 

\[
r = r_0 [1 + \alpha_2 P_2(\cos\theta)].
\]

Find the area of the deformed sphere through terms of order $\alpha_2^2$. 

Hint. 
\[
dA = \left[ r^2 + \left( \frac{dr}{d\theta} \right)^2 \right]^{1/2} r \sin\theta\, d\theta\, d\varphi.
\]

ANS. $A = 4\pi r_0^2 \left[ 1 + \tfrac{4}{5} \alpha_2^2 + O(\alpha_2^3) \right]$. 

Note. The area element $dA$ follows from noting that the line element $ds$ for 
fixed $\varphi$ is given by 

\[
ds = (r^2\, d\theta^2 + dr^2)^{1/2} = \left( r^2 + (dr/d\theta)^2 \right)^{1/2} d\theta.
\]

12.5.17 A nuclear particle is in a potential $V(r, \theta, \varphi) = 0$ for $0 \le r < a$ and $\infty$ for $r > a$. 
The particle is described by a wave function $\psi(r, \theta, \varphi)$ which satisfies the wave 
equation 

\[
-\frac{\hbar^2}{2M} \nabla^2 \psi + V_0 \psi = E \psi
\]

and the boundary condition 

\[
\psi(r = a) = 0.
\]

Show that for the energy $E$ to be a minimum there must be no angular de- 
pendence in the wave function; that is, $\psi = \psi(r)$. 

Hint. The problem centers on the boundary condition on the radial function. 

12.5.18 (a) Write a subroutine to calculate the numerical value of the associated 
Legendre function $P_N^1(x)$ for given values of $N$ and $x$. 

Hint. With the known forms of $P_1^1$ and $P_2^1$ you can use the recurrence 
relation Eq. 12.85 to generate $P_N^1$, $N > 2$. 

(b) Check your subroutine by having it calculate $P_N^1(x)$ for $x = 0.0(0.5)1.0$ 
and $N = 1(1)10$. Check these numerical values against the known values 
of $P_N^1(0)$ and $P_N^1(1)$ and against the tabulated values of $P_N^1(0.5)$. 
12.5.19 Calculate the magnetic vector potential of a current loop, Example 12.5.1. 
Tabulate your results for $r/a = 1.5(0.5)5.0$ and $\theta = 0^\circ(15^\circ)90^\circ$. Include terms 
in the series expansion, Eq. 12.130, until the absolute values of the terms drop 
below the leading term by a factor of $10^5$ or more. 

Note. This associated Legendre expansion can be checked by comparison with 
the elliptic integral solution, Exercise 5.8.4. 

Check value. For $r/a = 4.0$ and $\theta = 20^\circ$, 
$A_\varphi/\mu_0 I = 4.9398 \times 10^{-3}$. 


12.6 SPHERICAL HARMONICS 


In the separation of variables of (1) Laplace's equation, (2) Helmholtz's or 
the space-dependence of the classical wave equation, and (3) the Schrodinger 
wave equation for central force fields, 

\[
\nabla^2 \psi + k^2 f(r) \psi = 0, \tag{12.137}
\]

the angular dependence, coming entirely from the Laplacian operator, is¹ 

\[
\frac{\Phi(\varphi)}{\sin\theta} \frac{d}{d\theta} \left( \sin\theta\, \frac{d\Theta}{d\theta} \right)
  + \frac{\Theta(\theta)}{\sin^2\theta} \frac{d^2 \Phi(\varphi)}{d\varphi^2}
  + n(n+1)\, \Theta(\theta) \Phi(\varphi) = 0. \tag{12.138}
\]


Azimuthal Dependence — Orthogonality 

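The separated azimuthal equation is 

\[
\frac{1}{\Phi(\varphi)} \frac{d^2 \Phi(\varphi)}{d\varphi^2} = -m^2, \tag{12.139}
\]

with solutions 

\[
\Phi(\varphi) = e^{-im\varphi}, \; e^{im\varphi}, \tag{12.140}
\]

which readily satisfy the orthogonal condition 

\[
\int_0^{2\pi} e^{-i m_1 \varphi}\, e^{i m_2 \varphi}\, d\varphi = 2\pi\, \delta_{m_1, m_2}. \tag{12.141}
\]
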

Notice that it is the product $\Phi_{m_1}^*(\varphi) \Phi_{m_2}(\varphi)$ that is taken and that * is used to 
indicate the complex conjugate function. This choice is not required, but it is 
convenient for quantum mechanical calculations. We could have used 

\[
\Phi = \sin m\varphi, \; \cos m\varphi \tag{12.142}
\]

and the conditions of orthogonality that form the basis for Fourier series 
(Chapter 14). For applications such as describing the earth's gravitational or 
magnetic field $\sin m\varphi$ and $\cos m\varphi$ would be the preferred choice (see Example 
12.6.1). 

In electrostatics and most other physical problems we require $m$ to be an 
integer in order that $\Phi(\varphi)$ may be a single-valued function of the azimuth angle. 
In quantum mechanics the question is much more involved because the observ- 
able quantity that must be single-valued is the square of the magnitude of the 
wave function, $\Phi^* \Phi$. However, it can be shown that we must still have $m$ integral. 
Compare footnote in Section 8.3. 

By means of Eq. 12.141, 

\[
\Phi_m = \frac{1}{\sqrt{2\pi}}\, e^{im\varphi} \tag{12.143}
\]

is orthonormal (orthogonal and normalized) with respect to integration over 
the azimuth angle $\varphi$. 

¹ For a separation constant of the form $n(n+1)$ with $n$ an integer, a Legendre 
equation series solution becomes a polynomial. Otherwise both series solu- 
tions diverge, Exercise 8.5.5. 


Polar Angle Dependence 

Splitting off the azimuthal dependence, the polar angle dependence ($\theta$) leads 
to the associated Legendre equation (12.80), which is satisfied by the associated 
Legendre functions; that is, $\Theta(\theta) = P_n^m(\cos\theta)$. To include negative values of $m$, 
we use Rodrigues' formula, Eq. 12.65, in the definition of $P_n^m(\cos\theta)$. This leads 
to 

\[
P_n^m(\cos\theta) = \frac{1}{2^n n!} (1 - x^2)^{m/2} \frac{d^{m+n}}{dx^{m+n}} (x^2 - 1)^n, \qquad -n \le m \le n. \tag{12.144}
\]

$P_n^m(\cos\theta)$ and $P_n^{-m}(\cos\theta)$ are related as indicated in Exercise 12.5.1. An advantage 
of this approach over simply defining $P_n^m(\cos\theta)$ for $0 \le m \le n$ and requiring 
that $P_n^{-m} = P_n^m$ is that the recurrence relations valid for $0 \le m \le n$ remain valid 
for $-n \le m < 0$. 

Normalizing the associated Legendre function by Eq. 12.103, we obtain the 
orthonormal function 

\[
\mathcal{P}_n^m(\cos\theta) = \sqrt{ \frac{2n+1}{2}\, \frac{(n-m)!}{(n+m)!} }\; P_n^m(\cos\theta), \qquad -n \le m \le n. \tag{12.145}
\]


Spherical Harmonics 

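The function $\Phi_m(\varphi)$ (Eq. 12.143) is orthonormal with respect to the azimuthal 
angle $\varphi$, whereas the function $\mathcal{P}_n^m(\cos\theta)$ (Eq. 12.145) is orthonormal with respect 
to the polar angle $\theta$. We take the product of the two and define 

\[
Y_n^m(\theta, \varphi) \equiv (-1)^m \sqrt{ \frac{2n+1}{4\pi}\, \frac{(n-m)!}{(n+m)!} }\; P_n^m(\cos\theta)\, e^{im\varphi} \tag{12.146}
\]

to obtain functions of two angles (and two indices) which are orthonormal over 
the spherical surface. These $Y_n^m(\theta, \varphi)$ are spherical harmonics. The complete 
orthogonality integral becomes 

\[
\int_{\varphi=0}^{2\pi} \int_{\theta=0}^{\pi} Y_{n_1}^{m_1 *}(\theta, \varphi)\, Y_{n_2}^{m_2}(\theta, \varphi) \sin\theta\, d\theta\, d\varphi = \delta_{n_1, n_2}\, \delta_{m_1, m_2}. \tag{12.147}
\]
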

The extra $(-1)^m$ included in the defining equation of $Y_n^m(\theta, \varphi)$ deserves some 
comment. It is clearly legitimate, since Eq. 12.137 is linear and homogeneous. 
It is not necessary, but in moving on to certain quantum mechanical calculations, 
particularly in the quantum theory of angular momentum (Section 12.7), it is 
most convenient. The factor $(-1)^m$ is a phase factor, often called the Condon- 
Shortley phase, after the authors of a classic text on atomic spectroscopy. The 
effect of this $(-1)^m$ (Eq. 12.146) and the $(-1)^m$ of Eq. 12.81a for $P_n^{-m}(\cos\theta)$ 
is to introduce an alternation of sign among the positive $m$ spherical harmonics. 
This is shown in Table 12.4. 

TABLE 12.4 Spherical Harmonics (Condon-Shortley Phase) 

$Y_0^0(\theta, \varphi) = \dfrac{1}{\sqrt{4\pi}}$ 

$Y_1^1(\theta, \varphi) = -\sqrt{\dfrac{3}{8\pi}}\, \sin\theta\, e^{i\varphi}$ 
$Y_1^0(\theta, \varphi) = \sqrt{\dfrac{3}{4\pi}}\, \cos\theta$ 
$Y_1^{-1}(\theta, \varphi) = +\sqrt{\dfrac{3}{8\pi}}\, \sin\theta\, e^{-i\varphi}$ 

$Y_2^2(\theta, \varphi) = \sqrt{\dfrac{5}{96\pi}}\; 3\sin^2\theta\, e^{2i\varphi}$ 
$Y_2^1(\theta, \varphi) = -\sqrt{\dfrac{5}{24\pi}}\; 3\sin\theta \cos\theta\, e^{i\varphi}$ 
$Y_2^0(\theta, \varphi) = \sqrt{\dfrac{5}{4\pi}} \left( \tfrac{3}{2}\cos^2\theta - \tfrac{1}{2} \right)$ 
$Y_2^{-1}(\theta, \varphi) = +\sqrt{\dfrac{5}{24\pi}}\; 3\sin\theta \cos\theta\, e^{-i\varphi}$ 
$Y_2^{-2}(\theta, \varphi) = \sqrt{\dfrac{5}{96\pi}}\; 3\sin^2\theta\, e^{-2i\varphi}$ 
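The sign pattern in Table 12.4 can be confirmed by coding Eq. 12.146 directly. The following is a minimal sketch under stated assumptions: `assoc_legendre` (a hypothetical helper) supplies $P_n^m$ for $m \ge 0$ from the recurrence Eq. 12.85, and negative $m$ is handled through the relation $Y_n^{-m} = (-1)^m [Y_n^m]^*$, which follows from combining Eq. 12.81a with Eq. 12.146. The name `sph_harm_cs` is also an assumption. Library routines may use different argument orders or phase conventions, so this is not a drop-in replacement for any of them.

```python
import cmath
import math


def assoc_legendre(n, m, x):
    """P_n^m(x) for 0 <= m <= n and |x| <= 1 (Eq. 12.81 convention), via Eq. 12.85."""
    s2 = max(0.0, 1.0 - x * x)
    p_prev, p_curr = 0.0, math.prod(range(1, 2 * m, 2)) * s2 ** (m / 2)
    for k in range(m, n):
        p_prev, p_curr = p_curr, ((2 * k + 1) * x * p_curr - (k + m) * p_prev) / (k - m + 1)
    return p_curr


def sph_harm_cs(n, m, theta, phi):
    """Y_n^m(theta, phi) as defined in Eq. 12.146, for -n <= m <= n."""
    if abs(m) > n:
        raise ValueError("need -n <= m <= n")
    if m < 0:
        # Eq. 12.81a with Eq. 12.146: Y_n^{-|m|} = (-1)^|m| conj(Y_n^{|m|}).
        return (-1) ** (-m) * sph_harm_cs(n, -m, theta, phi).conjugate()
    norm = math.sqrt((2 * n + 1) / (4 * math.pi)
                     * math.factorial(n - m) / math.factorial(n + m))
    return (-1) ** m * norm * assoc_legendre(n, m, math.cos(theta)) * cmath.exp(1j * m * phi)


if __name__ == "__main__":
    theta, phi = 0.7, 1.1
    table_value = -math.sqrt(3 / (8 * math.pi)) * math.sin(theta) * cmath.exp(1j * phi)
    print(sph_harm_cs(1, 1, theta, phi), table_value)
```

The recursion on negative $m$ keeps the phase logic in one place; the alternation of sign for positive $m$ then comes entirely from the explicit $(-1)^m$ in the last line of `sph_harm_cs`.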

The functions $Y_n^m(\theta, \varphi)$ acquired the name "spherical harmonics" first because 
they are defined over the surface of a sphere with $\theta$ the polar angle and $\varphi$ the 
azimuth. The "harmonic" was included because solutions of Laplace's equation 
were called harmonic functions and $Y_n^m(\theta, \varphi)$ is the angular part of such a 
solution. 

In the framework of quantum mechanics Eq. 12.138 becomes an orbital 
angular momentum equation and the solution $Y_L^M(\theta, \varphi)$ ($n$ replaced by $L$, $m$ 
by $M$) is an angular momentum eigenfunction: $L$ being the angular momentum 
quantum number and $M$ the $z$-axis projection of $L$. These relationships are 
developed in detail in Section 12.7. 

Laplace Series, Fundamental Expansion 
Theorem 

Part of the importance of spherical harmonics lies in the completeness
property, a consequence of the Sturm-Liouville form of Laplace’s equation.
This property, in this case, means that any function $f(\theta,\varphi)$ (with sufficient
continuity properties) evaluated over the surface of the sphere can be expanded
in a uniformly convergent double series of spherical harmonics² (Laplace’s
series):

$$f(\theta,\varphi) = \sum_{m,n} a_{mn}\,Y_n^m(\theta,\varphi). \tag{12.148}$$

If $f(\theta,\varphi)$ is known, the coefficients can be immediately found by the use of the
orthogonality integral. Within the framework of the theory of linear vector
spaces, the completeness of the spherical harmonics follows from Weierstrass’s
theorem.
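The remark that the coefficients follow from the orthogonality integral can be sketched numerically: multiply $f$ by $Y_n^{m*}$ and integrate over the sphere. The test function $f = \sin\theta\cos\varphi$, the helper names, and the midpoint quadrature below are assumptions made for illustration only.

```python
import cmath
import math

def assoc_legendre(n, m, x):
    """P_n^m(x) for 0 <= m <= n, without the Condon-Shortley phase."""
    root = math.sqrt(max(0.0, 1.0 - x * x))
    pmm = 1.0
    for k in range(1, m + 1):
        pmm *= (2 * k - 1) * root
    if n == m:
        return pmm
    pm1 = x * (2 * m + 1) * pmm
    for l in range(m + 2, n + 1):
        pmm, pm1 = pm1, (x * (2 * l - 1) * pm1 - (l + m - 1) * pmm) / (l - m)
    return pm1

def sph_harm_cs(n, m, theta, phi):
    """Y_n^m with the Condon-Shortley phase, as in Eq. 12.146."""
    if m < 0:
        return (-1) ** (-m) * sph_harm_cs(n, -m, theta, phi).conjugate()
    norm = math.sqrt((2 * n + 1) / (4 * math.pi)
                     * math.factorial(n - m) / math.factorial(n + m))
    return ((-1) ** m * norm * assoc_legendre(n, m, math.cos(theta))
            * cmath.exp(1j * m * phi))

def laplace_coefficient(f, n, m, n_theta=200, n_phi=32):
    """Coefficient of Y_n^m in Eq. 12.148: integral of Y_n^m* f over the sphere."""
    d_theta = math.pi / n_theta
    d_phi = 2 * math.pi / n_phi
    total = 0j
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            total += (sph_harm_cs(n, m, theta, phi).conjugate()
                      * f(theta, phi) * math.sin(theta))
    return total * d_theta * d_phi

def f(theta, phi):
    """Illustrative test function: x / r on the unit sphere."""
    return math.sin(theta) * math.cos(phi)

for n in range(3):
    for m in range(-n, n + 1):
        a = laplace_coefficient(f, n, m)
        if abs(a) > 1e-6:
            print(n, m, a)  # only n = 1, m = +1 and m = -1 survive
```

For this $f$ the two surviving coefficients have magnitude $\sqrt{2\pi/3}$ and opposite signs, which reflects the Condon-Shortley sign difference between $Y_1^1$ and $Y_1^{-1}$.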

EXAMPLE 12.6.1 Laplace Series: Gravity Fields

The gravity fields of the earth, moon, and Mars have been described by a
Laplace series with real eigenfunctions:

$$U(r,\theta,\varphi) = \frac{GM}{R}\left[\frac{R}{r} - \sum_{n=2}^{\infty}\sum_{m=0}^{n}\left(\frac{R}{r}\right)^{n+1}\left\{C_{nm}\,Y_{mn}^{e}(\theta,\varphi) + S_{nm}\,Y_{mn}^{o}(\theta,\varphi)\right\}\right]. \tag{12.148a}$$

Here $M$ is the mass of the body, $R$ the equatorial radius. The real functions
$Y_{mn}^{e}$ and $Y_{mn}^{o}$ are defined by

$$Y_{mn}^{e}(\theta,\varphi) = P_n^m(\cos\theta)\cos m\varphi,$$

$$Y_{mn}^{o}(\theta,\varphi) = P_n^m(\cos\theta)\sin m\varphi.$$

For applications such as this the real trigonometric forms are preferred to the
imaginary exponential form of $Y_L^M(\theta,\varphi)$. Satellite measurements have led to
the numerical values shown in Table 12.5.


TABLE 12.5 Gravity Field Coefficients, Eq. 12.148a

Coefficient   Earth            Moon                      Mars
C_20          1.083 × 10^-3    (0.200 ± 0.002) × 10^-3   (1.96 ± 0.01) × 10^-3
C_22          0.16 × 10^-5     (2.4 ± 0.5) × 10^-5       (-5 ± 1) × 10^-5
S_22          -0.09 × 10^-5    (0.5 ± 0.6) × 10^-5       (3 ± 1) × 10^-5

$C_{20}$ represents an equatorial bulge, whereas $C_{22}$ and $S_{22}$ represent an azimuthal
dependence of the gravitational field.


2 For a proof of this fundamental theorem see E. W. Hobson, The Theory of
Spherical and Ellipsoidal Harmonics. New York: Chelsea (1955), Chapter VII.
If $f(\theta,\varphi)$ is discontinuous we may still have convergence in the mean, Section
9.4.

EXERCISES

12.6.1 Show that the parity of $Y_L^M(\theta,\varphi)$ is $(-1)^L$. Note the disappearance of any $M$
dependence.

Hint. For the parity operation in spherical polar coordinates see Section 2.5
and a footnote in Section 12.2.

12.6.2 Prove that

$$Y_L^M(0,\varphi) = \left(\frac{2L+1}{4\pi}\right)^{1/2}\delta_{M0}.$$

12.6.3 In the theory of Coulomb excitation of nuclei we encounter $Y_L^M(\pi/2, 0)$. Show
that

$$Y_L^M\!\left(\frac{\pi}{2}, 0\right) = \left(\frac{2L+1}{4\pi}\right)^{1/2}\frac{[(L-M)!\,(L+M)!]^{1/2}}{(L-M)!!\,(L+M)!!}\,(-1)^{(L+M)/2} \quad \text{for } L+M \text{ even},$$

$$Y_L^M\!\left(\frac{\pi}{2}, 0\right) = 0 \quad \text{for } L+M \text{ odd}.$$

Here $(2n)!! = 2n(2n-2)\cdots 6\cdot 4\cdot 2$,

$(2n+1)!! = (2n+1)(2n-1)\cdots 5\cdot 3\cdot 1$.

12.6.4 (a) Express the elements of the quadrupole moment tensor $x_i x_j$ as a linear
combination of the spherical harmonics $Y_2^m$ (and $Y_0^0$).

Note. The tensor $x_i x_j$ is reducible. The $Y_0^0$ indicates the presence of a
scalar component.

(b) The quadrupole moment tensor is usually defined as

$$Q_{ij} = \int (3x_i x_j - r^2\delta_{ij})\,\rho(\mathbf{r})\,d\tau,$$

with $\rho(\mathbf{r})$ the charge density. Express the components of $(3x_i x_j - r^2\delta_{ij})$ in
terms of $r^2 Y_2^M$.

(c) What is the significance of the $-r^2\delta_{ij}$ term?

Hint. Compare Section 3.4.

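12.6.5 The orthogonal azimuthal functions yield a useful representation of the Dirac
delta function. Show that

$$\delta(\varphi_1 - \varphi_2) = \frac{1}{2\pi}\sum_{m=-\infty}^{\infty}\exp[im(\varphi_1 - \varphi_2)].$$

12.6.6 Derive the spherical harmonic closure relation

$$\sum_{l=0}^{\infty}\sum_{m=-l}^{+l} Y_l^m(\theta_1,\varphi_1)\,Y_l^{m*}(\theta_2,\varphi_2) = \frac{1}{\sin\theta_1}\,\delta(\theta_1 - \theta_2)\,\delta(\varphi_1 - \varphi_2)$$

$$= \delta(\cos\theta_1 - \cos\theta_2)\,\delta(\varphi_1 - \varphi_2).$$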


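12.6.7 The quantum mechanical angular momentum operators $L_x \pm iL_y$ are given by

$$L_x + iL_y = e^{i\varphi}\left(\frac{\partial}{\partial\theta} + i\cot\theta\,\frac{\partial}{\partial\varphi}\right),$$

$$L_x - iL_y = -e^{-i\varphi}\left(\frac{\partial}{\partial\theta} - i\cot\theta\,\frac{\partial}{\partial\varphi}\right).$$

Show that

(a) $(L_x + iL_y)\,Y_L^M(\theta,\varphi) = +\sqrt{(L-M)(L+M+1)}\;Y_L^{M+1}(\theta,\varphi)$,

(b) $(L_x - iL_y)\,Y_L^M(\theta,\varphi) = +\sqrt{(L+M)(L-M+1)}\;Y_L^{M-1}(\theta,\varphi)$.

12.6.8 With $L_\pm$ given by

$$L_\pm = L_x \pm iL_y = \pm e^{\pm i\varphi}\left[\frac{\partial}{\partial\theta} \pm i\cot\theta\,\frac{\partial}{\partial\varphi}\right],$$

show that

(a) $Y_l^m = \sqrt{\dfrac{(l+m)!}{(2l)!\,(l-m)!}}\;(L_-)^{l-m}\,Y_l^l$,

(b) $Y_l^m = \sqrt{\dfrac{(l-m)!}{(2l)!\,(l+m)!}}\;(L_+)^{l+m}\,Y_l^{-l}$.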


12.6.9 In some circumstances it is desirable to replace the imaginary exponential of
our spherical harmonic by sine or cosine. Morse and Feshbach define

$$Y_{mn}^{e} = P_n^m(\cos\theta)\cos m\varphi,$$

$$Y_{mn}^{o} = P_n^m(\cos\theta)\sin m\varphi,$$

where

$$\int_0^{2\pi}\!\!\int_0^{\pi}\left[Y_{mn}^{e\ \text{or}\ o}(\theta,\varphi)\right]^2\sin\theta\,d\theta\,d\varphi = \frac{4\pi}{2(2n+1)}\,\frac{(n+m)!}{(n-m)!} \quad \text{for } n = 1, 2, 3, \ldots$$

$$= 4\pi \quad \text{for } n = 0 \ (Y_{00}^{o} \text{ does not exist}).$$

These spherical harmonics are often named according to the patterns of their
positive and negative regions on the surface of a sphere: zonal harmonics for
$m = 0$, sectoral harmonics for $m = n$, and tesseral harmonics for $0 < m < n$.
For $n = 4$, $m = 0, 2, 4$, indicate on a diagram of a hemisphere (one diagram
for each spherical harmonic) the regions in which the spherical harmonic is
positive.

12.6.10 A function $f(r,\theta,\varphi)$ may be expressed as a Laplace series

$$f(r,\theta,\varphi) = \sum_{l,m} a_{lm}\,r^l\,Y_l^m(\theta,\varphi).$$

With $\langle\ \rangle_{\text{sphere}}$ used to mean the average over a sphere (centered on the origin),
show that

$$\langle f(r,\theta,\varphi)\rangle_{\text{sphere}} = f(0,0,0).$$


12.7 ANGULAR MOMENTUM AND LADDER 
OPERATORS 

Orbital Angular Momentum 

The classical concept of angular momentum $\mathbf{L}_{\text{classical}} = \mathbf{r}\times\mathbf{p}$ is presented in
Section 1.4 to introduce the cross product. Following the usual Schrödinger
representation of quantum mechanics, the classical linear momentum $\mathbf{p}$ is
replaced by the operator $-i\nabla$. The quantum mechanical angular momentum
operator becomes¹

$$\mathbf{L}_{QM} = -i\,\mathbf{r}\times\nabla. \tag{12.149}$$

This is used repeatedly in Sections 1.8, 1.9, and 2.4 to illustrate vector differential
operators. From Exercise 1.8.6 the angular momentum components satisfy a
commutation relation

$$[L_i, L_j] = i\varepsilon_{ijk}L_k. \tag{12.150}$$

The $\varepsilon_{ijk}$ is the Levi-Civita symbol of Section 3.4. A summation over the index $k$
is understood.

1 For simplicity, the $\hbar$ is dropped. This means that the angular momentum
is measured in units of $\hbar$.

From Exercises 2.5.12 and 2.5.13 we find

$$L_z = -i\,\frac{\partial}{\partial\varphi} \tag{12.151}$$

in spherical polar coordinates. Hence

$$L_z\,Y_L^M(\theta,\varphi) = M\,Y_L^M(\theta,\varphi). \tag{12.152}$$


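The differential operator corresponding to the square of the angular momentum

$$\mathbf{L}^2 = \mathbf{L}\cdot\mathbf{L} = L_x^2 + L_y^2 + L_z^2 \tag{12.153}$$

may be determined from

$$\mathbf{L}\cdot\mathbf{L} = -(\mathbf{r}\times\nabla)\cdot(\mathbf{r}\times\nabla), \tag{12.154}$$

which is the subject of Exercises 1.9.9 and 2.5.17(b). From these we find that
$\mathbf{L}\cdot\mathbf{L}$ operating on a spherical harmonic yields²

$$\mathbf{L}\cdot\mathbf{L}\,Y_L^M(\theta,\varphi) = -\left[\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial}{\partial\theta}\right) + \frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\varphi^2}\right]Y_L^M(\theta,\varphi), \tag{12.155}$$

or

$$\mathbf{L}\cdot\mathbf{L}\,Y_L^M(\theta,\varphi) = L(L+1)\,Y_L^M(\theta,\varphi). \tag{12.156}$$

This is Exercise 8.3.1.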
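The eigenvalue equations 12.152 and 12.156 can be spot-checked numerically. The sketch below applies central finite differences to $Y_2^1$ as listed in Table 12.4; the step sizes, the sample point, and the function names are illustrative assumptions.

```python
import cmath
import math

def y21(theta, phi):
    """Y_2^1 from Table 12.4 (Condon-Shortley phase)."""
    return (-math.sqrt(5 / (24 * math.pi)) * 3 * math.sin(theta)
            * math.cos(theta) * cmath.exp(1j * phi))

def lz(f, theta, phi, h=1e-5):
    """L_z f = -i df/dphi (Eq. 12.151), by a central difference."""
    return -1j * (f(theta, phi + h) - f(theta, phi - h)) / (2 * h)

def l_squared(f, theta, phi, h=1e-4):
    """L.L f from Eq. 12.155, by central differences."""
    # (1/sin) d/dtheta (sin df/dtheta), with the inner derivative at half steps
    up = math.sin(theta + h / 2) * (f(theta + h, phi) - f(theta, phi)) / h
    down = math.sin(theta - h / 2) * (f(theta, phi) - f(theta - h, phi)) / h
    theta_part = (up - down) / (h * math.sin(theta))
    phi_part = ((f(theta, phi + h) - 2 * f(theta, phi) + f(theta, phi - h))
                / (h * h * math.sin(theta) ** 2))
    return -(theta_part + phi_part)

theta0, phi0 = 0.7, 1.3            # arbitrary point away from the poles
y = y21(theta0, phi0)
print(lz(y21, theta0, phi0) / y)         # approximately 1, that is, M
print(l_squared(y21, theta0, phi0) / y)  # approximately 6, that is, L(L+1)
```

The ratios are only approximate because of truncation and round-off in the difference quotients.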

Equation 12.150 presents the basic commutation relations of the components
of the quantum mechanical angular momentum. Indeed, within the framework
of quantum mechanics, these commutation relations define an angular momentum
operator. From Eq. 12.152 our spherical harmonic $Y_L^M(\theta,\varphi)$ is an
eigenfunction of $L_z$ with eigenvalue $M$. Finally, from Eq. 12.156, $Y_L^M(\theta,\varphi)$ is also an
eigenfunction of $\mathbf{L}^2$ with eigenvalue $L(L+1)$.

2 In addition to these eigenvalue equations, the relation of $\mathbf{L}$ to rotations of
coordinate systems and to rotations of functions is examined in Sections 4.10
to 4.12.


General Operator Approach

Apart from the replacement of $\mathbf{p}$ by $-i\nabla$, the analysis so far has been in terms
of classical mathematics. Let us start anew with a more typical quantum
mechanical analysis.

1. We assume an Hermitian operator $\mathbf{J}$ whose components satisfy the
commutation relations

$$[J_i, J_j] = i\varepsilon_{ijk}J_k. \tag{12.157}$$

Otherwise $\mathbf{J}$ is arbitrary.

2. We assume that $\psi_{JM}$ is simultaneously a normalized
eigenfunction (or eigenvector) of $J_z$ with eigenvalue
$M$ and an eigenfunction of $\mathbf{J}^2$ with eigenvalue
$J(J+1)$:³

$$J_z\,\psi_{JM} = M\,\psi_{JM}, \tag{12.158}$$

$$\mathbf{J}^2\,\psi_{JM} = J(J+1)\,\psi_{JM}. \tag{12.159}$$

Otherwise $\psi_{JM}$ is assumed unknown.

Let us see what general conclusions we can develop. Then we shall let our
general operators $J_x$, $J_y$, and $J_z$ become the specific orbital angular momentum
operators $L_x$, $L_y$, and $L_z$. $\psi_{JM}$ will then become a function of the spherical polar
coordinate angles $\theta$ and $\varphi$. We derive its form, in terms of Legendre polynomials
and differential operators, and identify it with the spherical harmonic
$Y_L^M(\theta,\varphi)$. This will illustrate the generality and power of operator techniques,
particularly the use of ladder operators.⁴ It will also make clear the basis of the
Condon-Shortley phase factor, the association of the $(-1)^M$ with the positive
$M$ spherical harmonics.

The ladder operators are defined as

$$J_+ = J_x + iJ_y, \qquad J_- = J_x - iJ_y. \tag{12.160}$$

In terms of these operators $\mathbf{J}^2$ may be rewritten as

$$\mathbf{J}^2 = \tfrac{1}{2}(J_+J_- + J_-J_+) + J_z^2. \tag{12.161}$$


From the commutation relations, Eq. 12.157, we find

$$[J_z, J_+] = +J_+, \qquad [J_z, J_-] = -J_-, \qquad [J_+, J_-] = 2J_z. \tag{12.162}$$

Since $J_+$ commutes with $\mathbf{J}^2$ (Exercise 12.7.1),

$$\mathbf{J}^2(J_+\psi_{JM}) = J_+(\mathbf{J}^2\psi_{JM}) = J(J+1)(J_+\psi_{JM}). \tag{12.163}$$

Therefore, $J_+\psi_{JM}$ is still an eigenfunction of $\mathbf{J}^2$ with eigenvalue $J(J+1)$.
Similarly, for $J_-\psi_{JM}$. But from Eq. 12.162

$$J_zJ_+ = J_+(J_z + 1), \tag{12.164}$$

or

$$J_z(J_+\psi_{JM}) = J_+(J_z + 1)\psi_{JM} = (M+1)(J_+\psi_{JM}). \tag{12.165}$$

Therefore $J_+\psi_{JM}$ is still an eigenfunction of $J_z$ but now with eigenvalue $M + 1$.
$J_+$ has raised the eigenvalue by 1 and so is often called a raising operator.
Similarly, $J_-$ lowers the eigenvalue by 1 and so is often called a lowering operator.

With respect to rotations ($\mathbf{J}^2$, $J_z$, $J_+$, $J_-$), the $\psi_{JM}$ form an irreducible,
invariant subspace; $M$ varies and $J$ is fixed. In Section 4.10 this property appears
as the rotation group operating on the spherical harmonics, $Y_l^m$; $m$ varies and $l$
is fixed.

3 That $\psi_{JM}$ is an eigenfunction of both $J_z$ and $\mathbf{J}^2$ is a consequence of $[J_z, \mathbf{J}^2] = 0$.

4 Ladder operators can be developed for other mathematical functions.
Compare Section 13.1 for Hermite polynomials.

Now what is the effect of letting first $J_+$ and then $J_-$ operate on $\psi_{JM}$? The
answer comes from expressing $J_-J_+$ (and $J_+J_-$) in terms of $\mathbf{J}^2$ and $J_z$. From
Eqs. 12.157 and 12.161,

$$J_-J_+ = \mathbf{J}^2 - J_z(J_z + 1), \qquad J_+J_- = \mathbf{J}^2 - J_z(J_z - 1). \tag{12.166}$$

Then using Eqs. 12.158, 12.159, and 12.166,

$$\begin{aligned}
J_-J_+\psi_{JM} &= [J(J+1) - M(M+1)]\psi_{JM} = (J-M)(J+M+1)\psi_{JM}, \\
J_+J_-\psi_{JM} &= [J(J+1) - M(M-1)]\psi_{JM} = (J+M)(J-M+1)\psi_{JM}.
\end{aligned} \tag{12.167}$$

Now, multiply by $\psi_{JM}^*$ and integrate (over all angles for the spherical harmonics).
Since the $\psi_{JM}$ have been assumed normalized,

$$\begin{aligned}
\int \psi_{JM}^*\,J_-J_+\,\psi_{JM}\,d\tau &= (J-M)(J+M+1) \ge 0, \\
\int \psi_{JM}^*\,J_+J_-\,\psi_{JM}\,d\tau &= (J+M)(J-M+1) \ge 0.
\end{aligned} \tag{12.168}$$

The $\ge 0$ part is worth a comment. In the language of quantum mechanics, $J_+$
and $J_-$ are Hermitian conjugates,⁵

$$J_+^\dagger = J_-, \qquad J_-^\dagger = J_+. \tag{12.169}$$

Examples of this are provided by the matrices of Exercises 4.2.13 (spin 1/2), 4.2.15
(spin 1), and 4.2.18 (spin 3/2). Therefore

$$J_-J_+ = J_+^\dagger J_+, \tag{12.170}$$

and the expectation values, Eq. 12.168, must be positive or zero.⁶ For our
particular orbital angular momentum ladder operators, $L_+$ and $L_-$, explicit forms
are given in Exercises 2.5.14 and 12.6.7. The reader can show (Exercise 12.7.2)
that

$$\int Y_L^{M*}\,L_-(L_+Y_L^M)\,d\Omega = \int (L_+Y_L^M)^*(L_+Y_L^M)\,d\Omega. \tag{12.171}$$

This is a sort of integration by parts (with the extra minus sign in $L_-$ canceled
by the minus sign in the integration by parts formula). Actually the equality
is most easily verified by evaluating each side of Eq. 12.171, using Exercise 12.6.7.

From the right-hand side of Eq. 12.171 it is clear that the $\ge 0$ in Eq. 12.168 is
valid. With the $\ge 0$ justified, we must have $M$ restricted to the range
$-J \le M \le J$.

5 The Hermitian conjugation or adjoint operation is defined for matrices in
Section 4.5, for operators in general in Section 9.1.

6 For an excellent discussion of adjoint operators and Hilbert space see
A. Messiah, Quantum Mechanics, Chapter 7. New York: Wiley (1961).

Since $J_+$ raises the eigenvalue $M$ to $M + 1$, we relabel the resultant
eigenfunction $\psi_{J,M+1}$. The normalization is given by Eq. 12.168 as

$$J_+\psi_{JM} = \sqrt{(J-M)(J+M+1)}\;\psi_{J,M+1}, \tag{12.172}$$

taking the positive square root and not introducing any phase factor. By the
same arguments

$$J_-\psi_{JM} = \sqrt{(J+M)(J-M+1)}\;\psi_{J,M-1}. \tag{12.173}$$

Both $\psi_{J,M+1}$ and $\psi_{J,M-1}$ remain normalized to unity. An explicit calculation of
these results (using known ladder operators and known spherical harmonics) is
the topic of Exercise 12.6.7. In Eqs. 12.172 and 12.173 the positive square root
has been taken. Then the relative phase of $\psi_{J,M\pm 1}$ and $\psi_{JM}$ is determined by the
ladder operators.
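Equations 12.172 and 12.173 fix the matrix elements of $J_\pm$ in the basis $\psi_{JM}$, so the commutators of Eq. 12.162 can be checked with small matrices. The sketch below is one way to do this; the basis ordering $M = J, J-1, \ldots, -J$, the argument `two_j` (the integer $2J$), and the helper names are choices made here for illustration.

```python
import math
from fractions import Fraction

def ladder_matrices(two_j):
    """Matrices of J_z, J_+, J_- in the basis M = J, J-1, ..., -J."""
    dim = two_j + 1
    j = Fraction(two_j, 2)
    ms = [j - k for k in range(dim)]
    jz = [[float(ms[r]) if r == c else 0.0 for c in range(dim)]
          for r in range(dim)]
    jp = [[0.0] * dim for _ in range(dim)]
    jm = [[0.0] * dim for _ in range(dim)]
    for c, m in enumerate(ms):
        if c > 0:            # J_+ takes M to M+1 (Eq. 12.172)
            jp[c - 1][c] = math.sqrt((j - m) * (j + m + 1))
        if c < dim - 1:      # J_- takes M to M-1 (Eq. 12.173)
            jm[c + 1][c] = math.sqrt((j + m) * (j - m + 1))
    return jz, jp, jm

def matmul(a, b):
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))

jz, jp, jm = ladder_matrices(3)   # J = 3/2
two_jz = [[2 * x for x in row] for row in jz]
print(close(commutator(jz, jp), jp))       # [J_z, J_+] = +J_+
print(close(commutator(jp, jm), two_jz))   # [J_+, J_-] = 2 J_z
```

Because the matrices are built from the positive square roots only, the check also illustrates that no additional phase factor is needed for the commutation relations to hold.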

Repeated application of $J_+$ leads to

$$(J_+)^n\psi_{JM} = C_{JMn}\,\psi_{J,M+n}. \tag{12.174}$$

This operation must stop at $M' = M + n = J$, or else we would jump to $M' > J$
and be in contradiction with the conclusion from Eq. 12.168, $M \le J$. Equivalently,
we may say that whatever $M_{\max}$ is, since $J_+\psi_{J,M_{\max}} = 0$, the left-hand side
of Eq. 12.172 is zero, and therefore the right-hand side is zero. This yields
$M_{\max} = J$. In the same fashion,

$$(J_-)^n\psi_{JM} = D_{JMn}\,\psi_{J,M-n} \tag{12.175}$$

must terminate at $M'' = M - n = -J$. We conclude from this first, that

$$J_+\psi_{JJ} = 0, \qquad J_-\psi_{J,-J} = 0. \tag{12.176}$$

Second, since $M$ ranges from $+J$ to $-J$ in unit steps, $2J$ must be an integer. $J$ is
either an integer or half of an odd integer. As seen later, orbital angular momentum
is described with integral $J$. But from the spins of some of the fundamental
particles and of some nuclei, we get $J = \tfrac{1}{2}, \tfrac{3}{2}, \tfrac{5}{2}, \ldots$. Our angular momentum is
quantized, essentially as a result of the commutation relations.
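A short exact-arithmetic sketch of this counting argument: for a given $2J$ it lists the allowed $M$ values and the squared raising coefficient $(J-M)(J+M+1)$ of Eq. 12.172, which is nonnegative throughout and vanishes only at $M = J$. The function names and the use of `Fraction` are illustrative choices.

```python
from fractions import Fraction

def m_values(two_j):
    """Allowed M = -J, -J+1, ..., +J for J = two_j / 2."""
    j = Fraction(two_j, 2)
    return [-j + k for k in range(two_j + 1)]

def raising_factor_squared(two_j, m):
    """(J - M)(J + M + 1), the squared coefficient in Eq. 12.172."""
    j = Fraction(two_j, 2)
    return (j - m) * (j + m + 1)

for two_j in (1, 2, 3):
    ms = m_values(two_j)
    factors = [raising_factor_squared(two_j, m) for m in ms]
    print([str(m) for m in ms], [str(f) for f in factors])
```

The list always has $2J + 1$ entries, which is the statement that $2J$ must be an integer.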

Orbital Angular Momentum Operators 

Now we return to our specific orbital angular momentum operators, $L_x$, $L_y$,
and $L_z$. Equation 12.158 becomes

$$L_z\,\psi_{LM}(\theta,\varphi) = M\,\psi_{LM}(\theta,\varphi).$$

The explicit form of $L_z$ indicates that $\psi_{LM}(\theta,\varphi)$ has a $\varphi$ dependence of $e^{iM\varphi}$, with
$M$ an integer to keep $\psi_{LM}$ single-valued. And if $M$ is an integer, then $L$ is an
integer also.

To determine the $\theta$ dependence of $\psi_{LM}(\theta,\varphi)$, we proceed in two main steps:
(1) the determination of $\psi_{LL}(\theta,\varphi)$ and (2) the development of $\psi_{LM}(\theta,\varphi)$ in terms
of $\psi_{LL}$ with the phase fixed by $\psi_{L0}$.

Let

$$\psi_{LM}(\theta,\varphi) = \Theta_{LM}(\theta)\,e^{iM\varphi}. \tag{12.177}$$

From Eq. 12.176, using the form of $L_+$ given in Exercises 2.5.14 and 12.6.7, we
have

$$e^{i(L+1)\varphi}\left[\frac{d}{d\theta} - L\cot\theta\right]\Theta_{LL}(\theta) = 0, \tag{12.178}$$

and

$$\psi_{LL}(\theta,\varphi) = c_L\,\sin^L\theta\,e^{iL\varphi}. \tag{12.179}$$

Normalizing, we obtain

$$c_L^*\,c_L\int_0^{2\pi}\!\!\int_0^{\pi}\sin^{2L+1}\theta\,d\theta\,d\varphi = 1. \tag{12.180}$$

The $\theta$ integral may be evaluated as a beta function (Exercise 10.4.9) and

$$|c_L| = \sqrt{\frac{(2L+1)!!}{4\pi\,(2L)!!}} = \frac{\sqrt{(2L)!}}{2^L L!}\sqrt{\frac{2L+1}{4\pi}}. \tag{12.181}$$

This completes our first step.
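The two closed forms for $|c_L|$ in Eq. 12.181 and the normalization condition of Eq. 12.180 can be cross-checked as follows. This is a sketch under stated assumptions: the function names are hypothetical, the equality of the two forms is tested in exact rational arithmetic for $4\pi|c_L|^2$, and the $\theta$ integral is estimated with a midpoint rule rather than the beta function.

```python
import math
from fractions import Fraction

def double_factorial(k):
    """k!! with the convention 0!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def four_pi_c2_double_factorial(L):
    """4*pi*|c_L|^2 from the double-factorial form of Eq. 12.181."""
    return Fraction(double_factorial(2 * L + 1), double_factorial(2 * L))

def four_pi_c2_factorial(L):
    """4*pi*|c_L|^2 from the factorial form of Eq. 12.181."""
    return Fraction(math.factorial(2 * L) * (2 * L + 1),
                    (2 ** L * math.factorial(L)) ** 2)

def theta_integral(L, n=2000):
    """Midpoint estimate of the integral of sin^(2L+1)(theta) over [0, pi]."""
    h = math.pi / n
    return h * sum(math.sin((i + 0.5) * h) ** (2 * L + 1) for i in range(n))

for L in range(4):
    exact = four_pi_c2_double_factorial(L)
    same = exact == four_pi_c2_factorial(L)
    # Eq. 12.180: |c_L|^2 * 2*pi * (theta integral) should equal 1
    norm = float(exact) / (4 * math.pi) * 2 * math.pi * theta_integral(L)
    print(L, exact, same, norm)
```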

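To obtain the $\psi_{LM}$, $M \ne \pm L$, we return to the ladder operators. From
Eqs. 12.172 and 12.173 ($J_+$ replaced by $L_+$ and $J_-$ replaced by $L_-$),

$$\begin{aligned}
\psi_{LM}(\theta,\varphi) &= \sqrt{\frac{(L+M)!}{(2L)!\,(L-M)!}}\;(L_-)^{L-M}\,\psi_{LL}(\theta,\varphi), \\
\psi_{LM}(\theta,\varphi) &= \sqrt{\frac{(L-M)!}{(2L)!\,(L+M)!}}\;(L_+)^{L+M}\,\psi_{L,-L}(\theta,\varphi).
\end{aligned} \tag{12.182}$$

Again, note that the relative phases are set by the ladder operators. $L_+$ and
$L_-$ operating on $\Theta_{LM}(\theta)e^{iM\varphi}$ may be written as

$$\begin{aligned}
L_+\Theta_{LM}(\theta)e^{iM\varphi} &= e^{i(M+1)\varphi}\left[\frac{d}{d\theta} - M\cot\theta\right]\Theta_{LM}(\theta) \\
&= -e^{i(M+1)\varphi}\sin^{1+M}\theta\,\frac{d}{d(\cos\theta)}\left[\sin^{-M}\theta\;\Theta_{LM}(\theta)\right], \\
L_-\Theta_{LM}(\theta)e^{iM\varphi} &= -e^{i(M-1)\varphi}\left[\frac{d}{d\theta} + M\cot\theta\right]\Theta_{LM}(\theta) \\
&= e^{i(M-1)\varphi}\sin^{1-M}\theta\,\frac{d}{d(\cos\theta)}\left[\sin^{M}\theta\;\Theta_{LM}(\theta)\right].
\end{aligned} \tag{12.183}$$

Repeating these operations $n$ times yields

$$\begin{aligned}
(L_+)^n\Theta_{LM}(\theta)e^{iM\varphi} &= (-1)^n e^{i(M+n)\varphi}\sin^{n+M}\theta\,\frac{d^n}{d(\cos\theta)^n}\left[\sin^{-M}\theta\;\Theta_{LM}(\theta)\right], \\
(L_-)^n\Theta_{LM}(\theta)e^{iM\varphi} &= e^{i(M-n)\varphi}\sin^{n-M}\theta\,\frac{d^n}{d(\cos\theta)^n}\left[\sin^{M}\theta\;\Theta_{LM}(\theta)\right].
\end{aligned} \tag{12.184}$$

From Eq. 12.182

$$\psi_{LM}(\theta,\varphi) = c_L\sqrt{\frac{(L+M)!}{(2L)!\,(L-M)!}}\;e^{iM\varphi}\sin^{-M}\theta\,\frac{d^{L-M}}{d(\cos\theta)^{L-M}}\sin^{2L}\theta, \tag{12.185}$$

and for $M = -L$

$$\psi_{L,-L}(\theta,\varphi) = \frac{c_L}{(2L)!}\,e^{-iL\varphi}\sin^{L}\theta\,\frac{d^{2L}}{d(\cos\theta)^{2L}}\sin^{2L}\theta = (-1)^L c_L\,\sin^L\theta\,e^{-iL\varphi}. \tag{12.186}$$

Note the characteristic $(-1)^L$ phase of $\psi_{L,-L}$ relative to $\psi_{L,L}$. This $(-1)^L$ enters
from

$$\sin^{2L}\theta = (1 - x^2)^L = (-1)^L(x^2 - 1)^L, \qquad x = \cos\theta. \tag{12.187}$$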


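Combining Eqs. 12.182, 12.184, and 12.186, we obtain

$$\psi_{LM}(\theta,\varphi) = (-1)^L c_L\sqrt{\frac{(L-M)!}{(2L)!\,(L+M)!}}\;(-1)^{L+M}e^{iM\varphi}\sin^{M}\theta\,\frac{d^{L+M}}{d(\cos\theta)^{L+M}}\sin^{2L}\theta. \tag{12.188}$$

Equations 12.185 and 12.188 agree that

$$\psi_{L0}(\theta,\varphi) = c_L\,\frac{1}{\sqrt{(2L)!}}\,\frac{d^{L}}{d(\cos\theta)^{L}}\sin^{2L}\theta. \tag{12.189}$$

Using Rodrigues’s formula, Eq. 12.65, we have

$$\psi_{L0}(\theta,\varphi) = (-1)^L c_L\,\frac{2^L L!}{\sqrt{(2L)!}}\,P_L(\cos\theta) = (-1)^L\frac{c_L}{|c_L|}\sqrt{\frac{2L+1}{4\pi}}\;P_L(\cos\theta). \tag{12.190}$$

The last equality follows from Eq. 12.181. We now demand that $\psi_{L0}(0, 0)$ be
real and positive. Therefore

$$c_L = (-1)^L|c_L| = (-1)^L\,\frac{\sqrt{(2L)!}}{2^L L!}\sqrt{\frac{2L+1}{4\pi}}. \tag{12.191}$$

With $(-1)^L c_L/|c_L| = 1$, $\psi_{L0}(\theta,\varphi)$ in Eq. 12.190 may be identified with the
spherical harmonic $Y_L^0(\theta,\varphi)$ of Section 12.6.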

When we substitute $(-1)^L c_L$ into Eq. 12.188,

$$\begin{aligned}
\psi_{LM}(\theta,\varphi) &= \frac{\sqrt{(2L)!}}{2^L L!}\sqrt{\frac{2L+1}{4\pi}}\sqrt{\frac{(L-M)!}{(2L)!\,(L+M)!}}\;(-1)^{L+M}e^{iM\varphi}\sin^{M}\theta\,\frac{d^{L+M}}{d(\cos\theta)^{L+M}}\sin^{2L}\theta \\
&= \sqrt{\frac{2L+1}{4\pi}}\sqrt{\frac{(L-M)!}{(L+M)!}}\;e^{iM\varphi}(-1)^M\left\{\frac{1}{2^L L!}(1-x^2)^{M/2}\frac{d^{L+M}}{dx^{L+M}}(x^2-1)^L\right\}, \\
&\qquad x = \cos\theta, \quad M \ge 0.
\end{aligned} \tag{12.192}$$

The expression in the curly bracket is identified as the associated Legendre
function, $P_L^M$ (Eq. 12.144), and we have

$$\psi_{LM}(\theta,\varphi) = Y_L^M(\theta,\varphi) = (-1)^M\sqrt{\frac{2L+1}{4\pi}\,\frac{(L-M)!}{(L+M)!}}\;P_L^M(\cos\theta)\,e^{iM\varphi}, \qquad M \ge 0, \tag{12.193}$$

in complete agreement with Section 12.6. Then by Eq. 12.81a, $Y_L^M$ for negative
superscript is given by

$$Y_L^{-M}(\theta,\varphi) = (-1)^M\,Y_L^{M*}(\theta,\varphi). \tag{12.194}$$

Our angular momentum eigenfunctions $\psi_{LM}(\theta,\varphi)$ are identified with the
spherical harmonics. The phase factor $(-1)^M$ is associated with the positive
values of $M$ and is seen to be a consequence of the ladder operators.

Our development of spherical harmonics here may be considered a portion
of Lie algebra, related to group theory, Section 4.10.


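EXERCISES


12.7.1 Show that

(a) $[J_+, \mathbf{J}^2] = 0$,

(b) $[J_-, \mathbf{J}^2] = 0$.

12.7.2 Using the known forms of $L_+$ and $L_-$ (Exercises 2.5.14 and 12.6.7), show that

$$\int Y_L^{M*}\,L_-(L_+Y_L^M)\,d\Omega = \int (L_+Y_L^M)^*(L_+Y_L^M)\,d\Omega.$$

12.7.3 Derive the relations

(a) $\psi_{LM}(\theta,\varphi) = \sqrt{\dfrac{(L+M)!}{(2L)!\,(L-M)!}}\;(L_-)^{L-M}\,\psi_{LL}(\theta,\varphi)$,

(b) $\psi_{LM}(\theta,\varphi) = \sqrt{\dfrac{(L-M)!}{(2L)!\,(L+M)!}}\;(L_+)^{L+M}\,\psi_{L,-L}(\theta,\varphi)$.

12.7.4 Derive the multiple operator equations

(a) $(L_+)^n\Theta_{LM}(\theta)e^{iM\varphi} = (-1)^n e^{i(M+n)\varphi}\sin^{n+M}\theta\,\dfrac{d^n}{d(\cos\theta)^n}\left[\sin^{-M}\theta\;\Theta_{LM}(\theta)\right]$,

(b) $(L_-)^n\Theta_{LM}(\theta)e^{iM\varphi} = e^{i(M-n)\varphi}\sin^{n-M}\theta\,\dfrac{d^n}{d(\cos\theta)^n}\left[\sin^{M}\theta\;\Theta_{LM}(\theta)\right]$.

Hint. Try mathematical induction.

12.7.5 Show, using $(L_-)^n$, that

$$Y_L^{-M}(\theta,\varphi) = (-1)^M\,Y_L^{M*}(\theta,\varphi).$$

12.7.6 Verify by explicit calculation that

(a) $L_+Y_1^0(\theta,\varphi) = -\sqrt{\dfrac{3}{4\pi}}\,\sin\theta\,e^{i\varphi} = \sqrt{2}\,Y_1^1(\theta,\varphi)$,

(b) $L_-Y_1^0(\theta,\varphi) = +\sqrt{\dfrac{3}{4\pi}}\,\sin\theta\,e^{-i\varphi} = \sqrt{2}\,Y_1^{-1}(\theta,\varphi)$.

The signs (Condon-Shortley phase) are a consequence of the ladder operators,
$L_+$ and $L_-$.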


12.8 THE ADDITION THEOREM FOR SPHERICAL 
HARMONICS 

Trigonometric Identity 

FIG. 12.16

In the following discussion $(\theta_1,\varphi_1)$ and $(\theta_2,\varphi_2)$ denote two different directions
in our spherical coordinate system, separated by an angle $\gamma$ (Fig. 12.16). These
angles satisfy the trigonometric identity

$$\cos\gamma = \cos\theta_1\cos\theta_2 + \sin\theta_1\sin\theta_2\cos(\varphi_1 - \varphi_2), \tag{12.195}$$

which is perhaps most easily proved by vector methods (compare Chapter 1).
The addition theorem, then, asserts that

$$P_n(\cos\gamma) = \frac{4\pi}{2n+1}\sum_{m=-n}^{n}(-1)^m\,Y_n^m(\theta_1,\varphi_1)\,Y_n^{-m}(\theta_2,\varphi_2), \tag{12.196}$$

or equivalently,

$$P_n(\cos\gamma) = \frac{4\pi}{2n+1}\sum_{m=-n}^{n}Y_n^m(\theta_1,\varphi_1)\,Y_n^{m*}(\theta_2,\varphi_2). \tag{12.197}$$

In terms of the associated Legendre functions the addition theorem is

$$P_n(\cos\gamma) = P_n(\cos\theta_1)P_n(\cos\theta_2) + 2\sum_{m=1}^{n}\frac{(n-m)!}{(n+m)!}\,P_n^m(\cos\theta_1)\,P_n^m(\cos\theta_2)\cos m(\varphi_1 - \varphi_2). \tag{12.198}$$

Equation 12.195 is a special case of Eq. 12.198.
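A numerical spot check of the associated Legendre form of the addition theorem, Eq. 12.198, with $\cos\gamma$ taken from Eq. 12.195. The sample angles and the function names are arbitrary illustrative choices; the recurrence used for $P_n^m$ omits the Condon-Shortley phase, which cancels in the product.

```python
import math

def assoc_legendre(n, m, x):
    """P_n^m(x) for 0 <= m <= n, without the Condon-Shortley phase."""
    root = math.sqrt(max(0.0, 1.0 - x * x))
    pmm = 1.0
    for k in range(1, m + 1):
        pmm *= (2 * k - 1) * root
    if n == m:
        return pmm
    pm1 = x * (2 * m + 1) * pmm
    for l in range(m + 2, n + 1):
        pmm, pm1 = pm1, (x * (2 * l - 1) * pm1 - (l + m - 1) * pmm) / (l - m)
    return pm1

def cos_gamma(t1, p1, t2, p2):
    """Eq. 12.195."""
    return (math.cos(t1) * math.cos(t2)
            + math.sin(t1) * math.sin(t2) * math.cos(p1 - p2))

def addition_rhs(n, t1, p1, t2, p2):
    """Right-hand side of Eq. 12.198."""
    x1, x2 = math.cos(t1), math.cos(t2)
    total = assoc_legendre(n, 0, x1) * assoc_legendre(n, 0, x2)
    for m in range(1, n + 1):
        ratio = math.factorial(n - m) / math.factorial(n + m)
        total += (2 * ratio * assoc_legendre(n, m, x1)
                  * assoc_legendre(n, m, x2) * math.cos(m * (p1 - p2)))
    return total

angles = (0.4, 1.1, 2.0, -0.7)    # theta1, phi1, theta2, phi2
for n in range(5):
    lhs = assoc_legendre(n, 0, cos_gamma(*angles))   # P_n(cos gamma)
    print(n, lhs, addition_rhs(n, *angles))
```

For $n = 1$ the two columns reduce to Eq. 12.195 itself, which is the special case mentioned in the text.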


Derivation of Addition Theorem 

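We now derive Eq. 12.197. Let $g(\theta,\varphi)$ be a function that may be expanded in
a Laplace series

$$\begin{aligned}
g(\theta_1,\varphi_1) &= Y_n^m(\theta_1,\varphi_1) && \text{relative to } x_1, y_1, z_1 \\
&= \sum_{m=-n}^{n} a_{nm}\,Y_n^m(\gamma,\psi) && \text{relative to } x_2, y_2, z_2.
\end{aligned} \tag{12.199}$$

Actually the choice of the 0 of the azimuth angle $\psi$ is irrelevant. At $\gamma = 0$ we have

$$g(\theta_1,\varphi_1)\big|_{\gamma=0} = a_{n0}\left(\frac{2n+1}{4\pi}\right)^{1/2}, \tag{12.200}$$

since $P_n(1) = 1$, whereas $P_n^m(1) = 0$ ($m \ne 0$). Multiplying Eq. 12.199 by $Y_n^{0*}(\gamma,\psi)$
and integrating over the sphere, we obtain

$$\int g(\theta_1,\varphi_1)\,Y_n^{0*}(\gamma,\psi)\,d\Omega_{\gamma,\psi} = a_{n0}. \tag{12.201}$$

Now, using Eq. 12.199, we may rewrite Eq. 12.201 as

$$\int Y_n^m(\theta_1,\varphi_1)\,Y_n^{0*}(\gamma,\psi)\,d\Omega = a_{n0}. \tag{12.202}$$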

As for Eq. 12.199, we assume that $P_n(\cos\gamma)$ has an expansion of the form

$$P_n(\cos\gamma) = \sum_{m=-n}^{n} b_{nm}\,Y_n^m(\theta_1,\varphi_1), \tag{12.203}$$

where the $b_{nm}$ will, of course, depend on $\theta_2, \varphi_2$, that is, on the orientation of the
$z_2$-axis. Multiplying by $Y_n^{m*}(\theta_1,\varphi_1)$ and integrating with respect to $\theta_1$ and $\varphi_1$
over the sphere, we have

$$\int P_n(\cos\gamma)\,Y_n^{m*}(\theta_1,\varphi_1)\,d\Omega_{\theta_1,\varphi_1} = b_{nm}. \tag{12.204}$$

1 The asterisk may go on either spherical harmonic.


In terms of spherical harmonics Eq. 12.204 becomes

$$\int\left(\frac{4\pi}{2n+1}\right)^{1/2}Y_n^0(\gamma,\psi)\,Y_n^{m*}(\theta_1,\varphi_1)\,d\Omega = b_{nm}. \tag{12.205}$$

Note that the subscripts have been dropped from the solid angle element $d\Omega$.
Since the range of integration is over all solid angles, the choice of polar axis
is irrelevant. Then in a comparison of Eqs. 12.202 and 12.205,

$$\begin{aligned}
b_{nm}^* &= a_{n0}\left(\frac{4\pi}{2n+1}\right)^{1/2} \\
&= \frac{4\pi}{2n+1}\,g(\theta_1,\varphi_1)\big|_{\gamma=0} && \text{by Eq. 12.200} \\
&= \frac{4\pi}{2n+1}\,Y_n^m(\theta_2,\varphi_2) && \text{by Eq. 12.199.}
\end{aligned} \tag{12.206}$$

The change in subscripts occurs because

$$\theta_1 \to \theta_2, \qquad \varphi_1 \to \varphi_2 \qquad \text{for } \gamma \to 0.$$

Substituting back into Eq. 12.203, we obtain Eq. 12.197, thus proving our
addition theorem.

The reader familiar with group theory will find a much more elegant proof
of Eq. 12.197 by using the rotation group.² This is Exercise 4.10.11.

One application of the addition theorem is in the construction of a Green’s
function for the three-dimensional Laplace equation in spherical polar
coordinates. If the source is on the polar axis at the point ($r = a$, $\theta = 0$, $\varphi = 0$),
then by Eq. 12.4

$$\frac{1}{R} = \frac{1}{|\mathbf{r} - \hat{\mathbf{k}}a|} = \begin{cases}
\displaystyle\sum_{n=0}^{\infty}P_n(\cos\gamma)\,\frac{a^n}{r^{n+1}}, & r > a, \\[2ex]
\displaystyle\sum_{n=0}^{\infty}P_n(\cos\gamma)\,\frac{r^n}{a^{n+1}}, & r < a.
\end{cases} \tag{12.207}$$

Rotating our coordinate system to put the source at $(a, \theta_2, \varphi_2)$ and the point of
observation at $(r, \theta_1, \varphi_1)$, we obtain

$$G(r,\theta_1,\varphi_1,a,\theta_2,\varphi_2) = \frac{1}{R} = \begin{cases}
\displaystyle\sum_{n=0}^{\infty}\sum_{m=-n}^{n}\frac{4\pi}{2n+1}\,Y_n^{m*}(\theta_1,\varphi_1)\,Y_n^m(\theta_2,\varphi_2)\,\frac{a^n}{r^{n+1}}, & r > a, \\[2ex]
\displaystyle\sum_{n=0}^{\infty}\sum_{m=-n}^{n}\frac{4\pi}{2n+1}\,Y_n^{m*}(\theta_1,\varphi_1)\,Y_n^m(\theta_2,\varphi_2)\,\frac{r^n}{a^{n+1}}, & r < a.
\end{cases} \tag{12.208}$$

In Section 16.6 this argument is reversed to provide another derivation of the
Legendre polynomial addition theorem.

2 Compare M. E. Rose, Elementary Theory of Angular Momentum. New York:
Wiley (1957).


EXERCISES 


12.8.1 In proving the addition theorem, we assumed that $Y_n^m(\theta_1,\varphi_1)$ could be expanded
in a series of $Y_n^m(\theta_2,\varphi_2)$ in which $m$ varied from $-n$ to $+n$ but $n$ was held fixed.
What arguments can you develop to justify summing only over the upper index
$m$ and not over the lower index $n$?

Hints. One possibility is to examine the homogeneity of the $Y_n^m$, that is, $Y_n^m$
may be expressed entirely in terms of the form $\cos^{n-p}\theta\,\sin^{p}\theta$ or $x^{n-p-s}y^{p}z^{s}/r^{n}$.
Another possibility is to examine the behavior of the Legendre equation
$[\nabla^2 + n(n+1)/r^2]\,P_n(\cos\theta) = 0$ under rotation of the coordinate system.


12.8.2 An atomic electron with angular momentum $L$ and magnetic quantum number
$M$ has a wave function

$$\psi(r,\theta,\varphi) = f(r)\,Y_L^M(\theta,\varphi).$$

Show that the sum of the electron densities in a given complete shell is spherically
symmetric; that is, $\sum_{M=-L}^{L}\psi^*(r,\theta,\varphi)\,\psi(r,\theta,\varphi)$ is independent of $\theta$ and $\varphi$.


1 2.8.3 The potential of an electron at point r e in the field of Z protons at r p is 


9 = - 


Show that this may be written as 


1 


47te 0 P % \r e ~ r p 




where r e > r p . How should <p be written for r e <r p ? 


12.8.4 Two protons are uniformly distributed within the same spherical volume. If the
coordinates of one element of charge are $(r_1,\theta_1,\varphi_1)$ and the coordinates of the
other are $(r_2,\theta_2,\varphi_2)$ and $r_{12}$ is the distance between them, the element of energy
of repulsion will be given by

$$d\psi = \rho^2\,\frac{dv_1\,dv_2}{r_{12}} = \rho^2\,\frac{r_1^2\,dr_1\sin\theta_1\,d\theta_1\,d\varphi_1\;r_2^2\,dr_2\sin\theta_2\,d\theta_2\,d\varphi_2}{r_{12}}.$$

Here $\rho = \dfrac{\text{charge}}{\text{volume}} = \dfrac{3e}{4\pi R^3}$, charge density,

$r_{12}^2 = r_1^2 + r_2^2 - 2r_1r_2\cos\gamma$.

Calculate the total electrostatic energy (of repulsion) of the two protons. This
calculation is used in accounting for the mass difference in “mirror” nuclei, such
as O$^{15}$ and N$^{15}$.

ANS. $\dfrac{3}{5}\dfrac{e^2}{R}$ for $r_2 > r_1$, $\dfrac{3}{5}\dfrac{e^2}{R}$ for $r_2 < r_1$, $\dfrac{6}{5}\dfrac{e^2}{R}$ (total).

This is double that required to create a uniformly charged sphere because we have
two separate cloud charges interacting, not one charge interacting with itself
(with permutation of pairs not considered).


1 2.8.5 Each of the two Is electrons in helium may be described by a hydrogenic wave 
function 


/ Z 3\l/2 

r) = (— 3) e~ Zrl °o 
\nalj 

in the absence of the other electron. Here Z, the atomic number, is 2. The symbol 
a 0 is the Bohr radius, h 2 lme 2 . Find the mutual potential energy of the two electrons 
given by 

ij/*(r l )il/*(T 2 )—il/(r l )tl/(r 2 )d i r 1 d i r 2 . 

J r 12 


Note . d 3 r t = r\ dr 1 sin^ d6 t d<p ly 
r i 2 = | r i -r 2 |. 


ANS. 


5e 2 Z 
8 u 0 


12.8.6 The probability of finding a Is hydrogen electron in a volume element r 2 dr 
sin 6 dOdcp is 


exp [ - 2r/a 0 ] r 2 dr sin 6 d6 d(p. 
7ta 0 


12.8.7 


Find the corresponding electrostatic potential. Calculate the potential from 


V(Ti) = 




with r t not on the z-axis. Expand r 12 . Apply the Legendre polynomial addition 
theorem and show that the angular dependence of F^) drops out. 


A hydrogen electron in a 


ANS. 



2 p orbit has a charge distribution 


p = —~~^r 2 e r/a ° sin 2 6, 

where a 0 is the Bohr radius, h 2 /me 2 . Find the electrostatic potential corresponding 
to this charge distribution. 


1 2.8.8 The electric current density produced by a 2 p electron in a hydrogen atom is 


J = <Po 


& 

32mUo 


e r(a °r sin 0. 
Using 

find the magnetic vector potential produced by this hydrogen electron. 

Hint. Resolve into cartesian components. Use the addition theorem to eliminate 
y , the angle included between r x and r 2 . 

1 2.8.9 (a) As a Laplace series and as an example of Eq. 9.80 (now with complex func- 
tions), show that 

«(Pi - o*) = I yno 2 ,<P2) rrw , , <p,). 

n,m 

(b) Show also that this same Dirac delta function may be written as 
d(fi 1 -Q 2 ) = ^^-tlp„(cosy). 

Now, if you can justify equating the summations over n term by term , you have 
an alternate derivation of the spherical harmonic addition theorem. 


12.9 INTEGRALS OF THE PRODUCT OF THREE 
SPHERICAL HARMONICS 

Frequently in quantum mechanics we encounter integrals of the general form 

or | 

in which the integration is over all solid angles. The first factor in the integrand 
may come from the wave function of a final state and the third factor from an 
initial state, whereas the middle factor may represent an operator that is being 
evaluated or whose “matrix element” is being determined. 

By using group theoretical methods, as in the quantum theory of angular
momentum, we may give a general expression for the forms listed. The analysis
involves the vector-addition or Clebsch-Gordan coefficients, which have been
tabulated. Three general restrictions appear.¹ (1) The integral vanishes unless the
vector sum of the $L$’s (angular momentum) is zero, $|L_1 - L_3| \le L_2 \le L_1 + L_3$.
(2) The integral vanishes unless $M_2 + M_3 = M_1$. Here we have the theoretical
foundation of the vector model of atomic spectroscopy. (3) Finally, the integral
vanishes unless the product $Y_{L_1}^{M_1*}Y_{L_2}^{M_2}Y_{L_3}^{M_3}$ is even, that is, unless $L_1 + L_2 + L_3$
is an even integer. This is a parity conservation law.

Details of this general and powerful approach will be found in the references.
The reader will note that the vector-addition coefficients are developed in terms
of the Condon-Shortley phase convention in which the $(-1)^m$ of Eq. 12.146 is
associated with the positive $m$.

1 E. U. Condon and G. H. Shortley, The Theory of Atomic Spectra. Cambridge:
Cambridge University Press (1951); M. E. Rose, Elementary Theory of Angular
Momentum. New York: Wiley (1957); A. Edmonds, Angular Momentum in
Quantum Mechanics. Princeton, N.J.: Princeton University Press (1957);
E. P. Wigner, Group Theory and Its Applications to Quantum Mechanics
(translated by J. J. Griffin). New York: Academic Press (1959).

It is possible to evaluate many of the commonly encountered integrals of this
form with the techniques already developed. The integration over azimuth may
be carried out by inspection.

$$\int_0^{2\pi}e^{-iM_1\varphi}\,e^{iM_2\varphi}\,e^{iM_3\varphi}\,d\varphi = 2\pi\,\delta_{M_2+M_3-M_1,\,0}. \tag{12.209}$$

Physically this corresponds to the conservation of the $z$-component of angular
momentum.
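The azimuthal integral of Eq. 12.209 can be checked with an equally spaced sum, which integrates $e^{ik\varphi}$ exactly (up to round-off) as long as $|k|$ is smaller than the number of points. The function name and the default of 64 points are illustrative assumptions.

```python
import cmath
import math

def azimuth_integral(m1, m2, m3, n_phi=64):
    """Equally spaced quadrature of the integral in Eq. 12.209.

    Valid for |m2 + m3 - m1| < n_phi.
    """
    d_phi = 2 * math.pi / n_phi
    k = m2 + m3 - m1
    return d_phi * sum(cmath.exp(1j * k * j * d_phi) for j in range(n_phi))

print(azimuth_integral(1, 0, 1))   # approximately 2*pi, since M2 + M3 = M1
print(azimuth_integral(2, 0, 1))   # approximately 0, since M2 + M3 != M1
```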


Application of Recurrence Relations 
A glance at Table 12.4 will show that the $\theta$-dependence of $Y_{L_2}^{M_2}$, that is,
$P_{L_2}^{M_2}(\cos\theta)$, can be expressed in terms of $\cos\theta$ and $\sin\theta$. However, a factor of $\cos\theta$
or $\sin\theta$ may be combined with the $Y_{L_3}^{M_3}$ factor by using the associated Legendre
polynomial recurrence relations. For instance, from Eqs. 12.85 and 12.86 we get

$$\cos\theta\,Y_L^M = +\left[\frac{(L-M+1)(L+M+1)}{(2L+1)(2L+3)}\right]^{1/2}Y_{L+1}^{M} + \left[\frac{(L-M)(L+M)}{(2L-1)(2L+1)}\right]^{1/2}Y_{L-1}^{M}, \tag{12.210}$$

$$e^{i\varphi}\sin\theta\,Y_L^M = -\left[\frac{(L+M+1)(L+M+2)}{(2L+1)(2L+3)}\right]^{1/2}Y_{L+1}^{M+1} + \left[\frac{(L-M)(L-M-1)}{(2L-1)(2L+1)}\right]^{1/2}Y_{L-1}^{M+1}, \tag{12.211}$$

$$e^{-i\varphi}\sin\theta\,Y_L^M = +\left[\frac{(L-M+1)(L-M+2)}{(2L+1)(2L+3)}\right]^{1/2}Y_{L+1}^{M-1} - \left[\frac{(L+M)(L+M-1)}{(2L-1)(2L+1)}\right]^{1/2}Y_{L-1}^{M-1}. \tag{12.212}$$

Using these equations, we obtain

$$\int Y_{L_1}^{M_1*}\cos\theta\,Y_L^M\,d\Omega = \left[\frac{(L-M+1)(L+M+1)}{(2L+1)(2L+3)}\right]^{1/2}\delta_{M_1,M}\,\delta_{L_1,L+1} + \left[\frac{(L-M)(L+M)}{(2L-1)(2L+1)}\right]^{1/2}\delta_{M_1,M}\,\delta_{L_1,L-1}. \tag{12.213}$$


The occurrence of the Kronecker delta ($L_1$, $L \pm 1$) is an aspect of the
conservation of angular momentum. Physically, this integral arises in a consideration of
ordinary atomic electromagnetic radiation (electric dipole). It leads to the
familiar selection rule that transitions to an atomic level with orbital angular
momentum quantum number $L_1$ can originate only from atomic levels with
quantum numbers $L_1 - 1$ or $L_1 + 1$. The application to expressions such as

$$\text{quadrupole moment} \sim \int Y_L^{M*}(\theta,\varphi)\,P_2(\cos\theta)\,Y_L^M(\theta,\varphi)\,d\Omega$$

is more involved but perfectly straightforward.
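The recurrence in Eq. 12.210, on which the selection rule rests, can be verified pointwise. The sketch below compares $\cos\theta\,Y_L^M$ with the two-term right-hand side at an arbitrary point; the helper names and the sample angles are illustrative assumptions, and $Y_L^M$ is built with the Condon-Shortley phase of Eq. 12.146.

```python
import cmath
import math

def assoc_legendre(n, m, x):
    """P_n^m(x) for 0 <= m <= n, without the Condon-Shortley phase."""
    root = math.sqrt(max(0.0, 1.0 - x * x))
    pmm = 1.0
    for k in range(1, m + 1):
        pmm *= (2 * k - 1) * root
    if n == m:
        return pmm
    pm1 = x * (2 * m + 1) * pmm
    for l in range(m + 2, n + 1):
        pmm, pm1 = pm1, (x * (2 * l - 1) * pm1 - (l + m - 1) * pmm) / (l - m)
    return pm1

def sph_harm_cs(n, m, theta, phi):
    """Y_n^m with the Condon-Shortley phase, as in Eq. 12.146."""
    if m < 0:
        return (-1) ** (-m) * sph_harm_cs(n, -m, theta, phi).conjugate()
    norm = math.sqrt((2 * n + 1) / (4 * math.pi)
                     * math.factorial(n - m) / math.factorial(n + m))
    return ((-1) ** m * norm * assoc_legendre(n, m, math.cos(theta))
            * cmath.exp(1j * m * phi))

def coeff_up(L, M):
    """Coefficient of Y_{L+1}^M in Eq. 12.210."""
    return math.sqrt((L - M + 1) * (L + M + 1) / ((2 * L + 1) * (2 * L + 3)))

def coeff_down(L, M):
    """Coefficient of Y_{L-1}^M in Eq. 12.210 (needs L >= 1)."""
    return math.sqrt((L - M) * (L + M) / ((2 * L - 1) * (2 * L + 1)))

def rhs_12_210(L, M, theta, phi):
    total = coeff_up(L, M) * sph_harm_cs(L + 1, M, theta, phi)
    if L - 1 >= abs(M):   # Y_{L-1}^M exists only when |M| <= L - 1
        total += coeff_down(L, M) * sph_harm_cs(L - 1, M, theta, phi)
    return total

theta0, phi0 = 0.9, 0.5
for L, M in [(0, 0), (1, 0), (1, 1), (2, -1), (3, 2)]:
    lhs = math.cos(theta0) * sph_harm_cs(L, M, theta0, phi0)
    print(L, M, abs(lhs - rhs_12_210(L, M, theta0, phi0)))  # near zero
```

Only $Y_{L+1}^M$ and $Y_{L-1}^M$ appear on the right, so orthonormality immediately gives the two Kronecker deltas of Eq. 12.213.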


EXERCISES 

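12.9.1 Verify

(a) $\displaystyle\int Y_L^M\,Y_0^0\,Y_L^{M*}\,d\Omega = \frac{1}{\sqrt{4\pi}}$,

(b) $\displaystyle\int Y_L^M\,Y_1^0\,Y_{L+1}^{M*}\,d\Omega = \sqrt{\frac{3}{4\pi}}\sqrt{\frac{(L+M+1)(L-M+1)}{(2L+1)(2L+3)}}$,

(c) $\displaystyle\int Y_L^M\,Y_1^1\,Y_{L+1}^{M+1*}\,d\Omega = \sqrt{\frac{3}{8\pi}}\sqrt{\frac{(L+M+1)(L+M+2)}{(2L+1)(2L+3)}}$,

(d) $\displaystyle\int Y_L^M\,Y_1^1\,Y_{L-1}^{M+1*}\,d\Omega = -\sqrt{\frac{3}{8\pi}}\sqrt{\frac{(L-M)(L-M-1)}{(2L-1)(2L+1)}}$.

These integrals were used in an investigation of the angular correlation of internal
conversion electrons.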


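12.9.2 Show that

(a) $\displaystyle\int_{-1}^{1}x\,P_L(x)\,P_N(x)\,dx = \begin{cases}
\dfrac{2(L+1)}{(2L+1)(2L+3)}, & N = L+1, \\[2ex]
\dfrac{2L}{(2L-1)(2L+1)}, & N = L-1.
\end{cases}$

(b) $\displaystyle\int_{-1}^{1}x^2\,P_L(x)\,P_N(x)\,dx = \begin{cases}
\dfrac{2(L+1)(L+2)}{(2L+1)(2L+3)(2L+5)}, & N = L+2, \\[2ex]
\dfrac{2(2L^2+2L-1)}{(2L-1)(2L+1)(2L+3)}, & N = L, \\[2ex]
\dfrac{2L(L-1)}{(2L-3)(2L-1)(2L+1)}, & N = L-2.
\end{cases}$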


12.9.3 Since $xP_n(x)$ is a polynomial (degree $n + 1$), it may be represented by the Legendre
series

$$xP_n(x) = \sum_{s=0}^{\infty}a_s\,P_s(x).$$

(a) Show that $a_s = 0$ for $s < n - 1$ and $s > n + 1$.

(b) Calculate $a_{n-1}$, $a_n$, and $a_{n+1}$ and show that you have reproduced the
recurrence relation, Eq. 12.17.

Note. This argument may be put in a general form to demonstrate the existence
of a three-term recurrence relation for any of our complete sets of orthogonal
polynomials:

$$x\varphi_n = a_{n+1}\varphi_{n+1} + a_n\varphi_n + a_{n-1}\varphi_{n-1}.$$


12.10 LEGENDRE FUNCTIONS OF THE SECOND
KIND, $Q_n(x)$

In all the analysis so far in this chapter we have been dealing with one solution 
of Legendre’s equation, the solution P n (cos 0), which is regular (finite) at the two 
singular points of the differential equation, cos 0 — ± 1. From the general theory 
of differential equations it is known that a second solution exists. We develop 
this second solution, Q n , by a series solution of Legendre’s equation. Later a 
closed form will be obtained. 


Series Solutions of Legendre's Equation 

To solve

$$\frac{d}{dx}\left[(1 - x^2)\frac{dy}{dx}\right] + n(n + 1)y = 0 \tag{12.214}$$

we proceed as in Chapter 8, letting$^1$

$$y = \sum_{\lambda=0}^{\infty} a_\lambda x^{k+\lambda} \tag{12.215}$$

with

$$y' = \sum_{\lambda=0}^{\infty} (k + \lambda)\, a_\lambda x^{k+\lambda-1}, \tag{12.216}$$

$$y'' = \sum_{\lambda=0}^{\infty} (k + \lambda)(k + \lambda - 1)\, a_\lambda x^{k+\lambda-2}. \tag{12.217}$$

Substitution into the original differential equation gives

$$\sum_{\lambda=0}^{\infty} (k + \lambda)(k + \lambda - 1)\, a_\lambda x^{k+\lambda-2} + \sum_{\lambda=0}^{\infty} \left[n(n + 1) - 2(k + \lambda) - (k + \lambda)(k + \lambda - 1)\right] a_\lambda x^{k+\lambda} = 0. \tag{12.218}$$

The indicial equation is

$$k(k - 1) = 0, \tag{12.219}$$

with solutions $k = 0, 1$. We try first $k = 0$ with $a_0 = 1$, $a_1 = 0$. Then our series is
described by the recurrence relation

$$(\lambda + 2)(\lambda + 1)\, a_{\lambda+2} + \left[n(n + 1) - 2\lambda - \lambda(\lambda - 1)\right] a_\lambda = 0, \tag{12.220}$$

which becomes

$$a_{\lambda+2} = -\frac{(n + \lambda + 1)(n - \lambda)}{(\lambda + 1)(\lambda + 2)}\, a_\lambda. \tag{12.221}$$

Labeling this series $p_n$, we have

$$p_n(x) = 1 - \frac{n(n + 1)}{2!}x^2 + \frac{(n - 2)n(n + 1)(n + 3)}{4!}x^4 + \cdots. \tag{12.222}$$

The second solution of the indicial equation, $k = 1$, with $a_0 = 1$, $a_1 = 0$, leads
to the recurrence relation

$$a_{\lambda+2} = -\frac{(n + \lambda + 2)(n - \lambda - 1)}{(\lambda + 2)(\lambda + 3)}\, a_\lambda. \tag{12.223}$$

Labeling this series $q_n$, we obtain

$$q_n(x) = x - \frac{(n - 1)(n + 2)}{3!}x^3 + \frac{(n - 3)(n - 1)(n + 2)(n + 4)}{5!}x^5 - \cdots. \tag{12.224}$$

1 Note that x may be replaced by the complex variable z.

Our general solution of Eq. 12.214, then, is

$$y_n(x) = A_n p_n(x) + B_n q_n(x), \tag{12.225}$$

provided we have convergence. From Gauss's test, Section 5.2 (see Example 5.2.4),
we do not have convergence at $x = \pm 1$. To get out of this difficulty, we set the
separation constant $n$ equal to an integer (Exercise 8.5.5) and convert the infinite
series into a polynomial.

For $n$, a positive even integer (or zero), series $p_n$ terminates, and with a proper
choice of a normalizing factor (selected to obtain agreement with the definition
of $P_n(x)$ in Section 12.1)

$$P_n(x) = (-1)^{n/2} \frac{n!}{2^n [(n/2)!]^2}\, p_n(x) = (-1)^s \frac{(2s)!}{2^{2s}(s!)^2}\, p_{2s}(x) = (-1)^s \frac{(2s - 1)!!}{(2s)!!}\, p_{2s}(x), \quad \text{for } n = 2s. \tag{12.226}$$

If $n$ is a positive odd integer, series $q_n$ terminates after a finite number of terms,
and we write

$$P_n(x) = (-1)^{(n-1)/2} \frac{n!}{2^{n-1}\{[(n - 1)/2]!\}^2}\, q_n(x) = (-1)^s \frac{(2s + 1)!}{2^{2s}(s!)^2}\, q_{2s+1}(x) = (-1)^s \frac{(2s + 1)!!}{(2s)!!}\, q_{2s+1}(x), \quad \text{for } n = 2s + 1. \tag{12.227}$$

Note that these expressions hold for all real values of $x$, $-\infty < x < \infty$, and for
complex values in the finite complex plane. The constants that multiply $p_n$ and
$q_n$ are chosen to make $P_n$ agree with Legendre polynomials given by the
generating function.
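
The termination of the series and the normalization in Eqs. 12.226 and 12.227 can be checked with exact rational arithmetic. The sketch below is illustrative only: the function names `series_coefficients` and `legendre_from_series` are assumptions introduced here, and NumPy's `leg2poly` is used purely as an independent reference for the power-series coefficients of $P_n(x)$.

```python
from fractions import Fraction
from math import factorial

import numpy as np
from numpy.polynomial import legendre

def series_coefficients(n, k, terms):
    """Leading coefficients of p_n (k = 0) or q_n (k = 1), starting from a_0 = 1.

    Applies Eq. 12.221 (k = 0) or Eq. 12.223 (k = 1); lam steps through 0, 2, 4, ...
    """
    coeffs = [Fraction(1)]
    for lam in range(0, 2 * (terms - 1), 2):
        if k == 0:
            ratio = Fraction(-(n + lam + 1) * (n - lam), (lam + 1) * (lam + 2))
        else:
            ratio = Fraction(-(n + lam + 2) * (n - lam - 1), (lam + 2) * (lam + 3))
        coeffs.append(coeffs[-1] * ratio)
    return coeffs

def legendre_from_series(n):
    """Ascending power-series coefficients of P_n(x) from Eqs. 12.226 and 12.227."""
    k = n % 2                      # k = 0 uses p_n (n even), k = 1 uses q_n (n odd)
    half = (n - k) // 2            # this is s in n = 2s or n = 2s + 1
    scale = Fraction((-1) ** half * factorial(n),
                     2 ** (n - k) * factorial(half) ** 2)
    poly = [Fraction(0)] * (n + 1)
    for j, a in enumerate(series_coefficients(n, k, half + 1)):
        poly[2 * j + k] = scale * a
    return poly

# Compare with NumPy's conversion of P_n to an ordinary power series.
for n in range(9):
    ours = [float(c) for c in legendre_from_series(n)]
    assert np.allclose(ours, legendre.leg2poly([0] * n + [1]))
```

For example, `legendre_from_series(4)` returns the coefficients 3/8, 0, -15/4, 0, 35/8 of $P_4(x)$ in ascending powers of $x$.
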

Equations 12.222 and 12.224 may still be used with $n = \nu$, not an integer, but
now the series no longer terminates, and the range of convergence becomes
$-1 < x < 1$. The end points, $x = \pm 1$, are not included.

It is sometimes convenient to reverse the order of the terms in the series. This
may be done by putting

$$s = \frac{n}{2} - \lambda \quad \text{in the first form of } P_n(x),\ n \text{ even},$$

$$s = \frac{n - 1}{2} - \lambda \quad \text{in the second form of } P_n(x),\ n \text{ odd},$$

so that Eqs. 12.226 and 12.227 become

$$P_n(x) = \sum_{s=0}^{[n/2]} (-1)^s \frac{(2n - 2s)!}{2^n\, s!\,(n - s)!\,(n - 2s)!}\, x^{n-2s}, \tag{12.228}$$

where the upper limit is $s = n/2$ (for $n$ even) or $(n - 1)/2$ (for $n$ odd). This reproduces
Eq. 12.8 of Section 12.1, which is obtained directly from the generating function.
This agreement with Eq. 12.8 is the reason for the particular choice of
normalization in Eqs. 12.226 and 12.227.


$Q_n(x)$, Functions of the Second Kind

It will be noticed that we have used only $p_n$ for $n$ even and $q_n$ for $n$ odd
(because they terminated for this choice of $n$). We may now define a second solution
of Legendre's equation (Fig. 12.17) by

$$Q_n(x) = (-1)^{n/2} \frac{[(n/2)!]^2\, 2^n}{n!}\, q_n(x) = (-1)^s \frac{(2s)!!}{(2s - 1)!!}\, q_{2s}(x), \quad \text{for } n \text{ even},\ n = 2s, \tag{12.229}$$

$$Q_n(x) = (-1)^{(n+1)/2} \frac{\{[(n - 1)/2]!\}^2\, 2^{n-1}}{n!}\, p_n(x) = (-1)^{s+1} \frac{(2s)!!}{(2s + 1)!!}\, p_{2s+1}(x), \quad \text{for } n \text{ odd},\ n = 2s + 1. \tag{12.230}$$

FIG. 12.18 Second Legendre function, $Q_n(x)$, $x > 1$

This choice of normalizing factors forces $Q_n$ to satisfy the same recurrence
relations as $P_n$. This may be verified by substituting Eqs. 12.229 and 12.230 into
Eqs. 12.17 and 12.26. Inspection of the (series) recurrence relations (Eqs. 12.221
and 12.223), that is, by the Cauchy ratio test, shows that $Q_n(x)$ will converge for
$-1 < x < 1$. If $|x| > 1$, these series forms of our second solution diverge. A
solution in a series of negative powers of $x$ can be developed for the region
$|x| > 1$ (Fig. 12.18) but we proceed to a closed form solution that can be used
over the entire complex plane (apart from the singular points $x = \pm 1$ and with
care on cut lines).


Closed Form Solutions 

Frequently, a closed form of the second solution, $Q_n(z)$, is desirable. This may
be obtained by the method discussed in Section 8.6. We write

$$Q_n(z) = P_n(z)\left\{A_n + B_n \int^{z} \frac{dx}{(1 - x^2)[P_n(x)]^2}\right\}, \tag{12.231}$$

in which the constant $A_n$ replaces the evaluation of the integral at the arbitrary
lower limit. Both constants, $A_n$ and $B_n$, may be determined for special cases.
For $n = 0$, Eq. 12.231 yields

$$Q_0(z) = P_0(z)\left\{A_0 + B_0 \int^{z} \frac{dx}{(1 - x^2)[P_0(x)]^2}\right\} = A_0 + B_0\, \frac{1}{2}\ln\frac{1 + z}{1 - z} = A_0 + B_0\left(z + \frac{z^3}{3} + \frac{z^5}{5} + \cdots + \frac{z^{2s+1}}{2s + 1} + \cdots\right), \tag{12.232}$$

the last expression following from a Maclaurin expansion of the logarithm.
Comparing this with the series solution (Eq. 12.224), we obtain

$$Q_0(z) = q_0(z) = z + \frac{z^3}{3} + \frac{z^5}{5} + \cdots + \frac{z^{2s+1}}{2s + 1} + \cdots, \tag{12.233}$$

so that $A_0 = 0$, $B_0 = 1$. Similar results follow for $n = 1$. We obtain

$$Q_1(z) = z\left[A_1 + B_1 \int^{z} \frac{dx}{(1 - x^2)x^2}\right] = A_1 z + B_1 z\left(\frac{1}{2}\ln\frac{1 + z}{1 - z} - \frac{1}{z}\right). \tag{12.234}$$

Expanding in a power series and comparing with $Q_1(z) = -p_1(z)$, we have
$A_1 = 0$, $B_1 = 1$. Therefore we may write

$$Q_0(z) = \frac{1}{2}\ln\frac{1 + z}{1 - z}, \qquad Q_1(z) = \frac{1}{2}z\ln\frac{1 + z}{1 - z} - 1, \qquad |z| < 1. \tag{12.235}$$


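Perhaps the best way of determining the higher-order $Q_n(z)$ is to use the
recurrence relation (Eq. 12.17), which may be verified for both $x^2 < 1$ and for
$x^2 > 1$ by substituting in the series forms. This recurrence relation technique
yields

$$Q_2(z) = \frac{1}{2}P_2(z)\ln\frac{1 + z}{1 - z} - \frac{3}{2}P_1(z). \tag{12.236}$$

Repeated application of the recurrence formula leads to

$$Q_n(z) = \frac{1}{2}P_n(z)\ln\frac{1 + z}{1 - z} - \frac{2n - 1}{1 \cdot n}P_{n-1}(z) - \frac{2n - 5}{3(n - 1)}P_{n-3}(z) - \cdots. \tag{12.237}$$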

From the form $\ln[(1 + z)/(1 - z)]$ it will be seen that for real $z$ these
expressions hold in the range $-1 < x < 1$. If we wish to have closed forms valid outside
this range, we need only replace

$$\ln\frac{1 + x}{1 - x} \quad \text{by} \quad \ln\frac{z + 1}{z - 1}.$$

When using the latter form, valid for large $z$, we take the line interval $-1 < x < 1$
as a cut line. Values of $Q_n(x)$ on the cut line are customarily assigned by the
relation

$$Q_n(x) = \frac{1}{2}\left[Q_n(x + i0) + Q_n(x - i0)\right], \tag{12.238}$$

the arithmetic average of approaches from the positive imaginary side and from
the negative imaginary side. The reader will note that for $z \to x > 1$,
$z - 1 \to (1 - x)e^{\pm i\pi}$. The result is that for all $z$, except on the real axis $-1 < x < 1$,
we have

$$Q_0(z) = \frac{1}{2}\ln\frac{z + 1}{z - 1}, \tag{12.239}$$

$$Q_1(z) = \frac{1}{2}z\ln\frac{z + 1}{z - 1} - 1, \tag{12.240}$$

and so on.


For convenient reference some special values of $Q_n(z)$ are given.

1. $Q_n(1) = \infty$, from the logarithmic term (Eq. 12.237).

2. $Q_n(\infty) = 0$. This is best obtained from a representation of $Q_n(x)$ as a series
of negative powers of $x$, Exercise 12.10.4.

3. $Q_n(-z) = (-1)^{n+1}Q_n(z)$. This follows from the series form. It may also be
derived by using $Q_0(z)$, $Q_1(z)$ and the recurrence relation (Eq. 12.17).

4. $Q_n(0) = 0$, for $n$ even, by (3).

5. $Q_n(0) = (-1)^{(n+1)/2} \dfrac{\{[(n - 1)/2]!\}^2\, 2^{n-1}}{n!} = (-1)^{s+1}\dfrac{(2s)!!}{(2s + 1)!!}$, for $n$ odd, $n = 2s + 1$.

This last result comes from the series form (Eq. 12.230) with $p_n(0) = 1$.
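
The closed forms of Eq. 12.235 together with the recurrence relation give a simple way to tabulate $Q_n(x)$ on the cut $-1 < x < 1$. The following is a minimal sketch under that assumption; the function name `legendre_q` is illustrative, and the upward recurrence shown here is intended only for $|x| < 1$ (outside that interval the same recurrence would need more care). The checks use the parity relation, the special value $Q_3(0) = 2/3$ from item 5, Eq. 12.236, and the identity of Exercise 12.10.6.

```python
import numpy as np
from numpy.polynomial import legendre

def legendre_q(nmax, x):
    """Return [Q_0(x), ..., Q_nmax(x)] for -1 < x < 1 by upward recurrence."""
    x = float(x)
    if not -1.0 < x < 1.0:
        raise ValueError("this sketch assumes -1 < x < 1")
    log_term = 0.5 * np.log((1.0 + x) / (1.0 - x))
    q = [log_term, x * log_term - 1.0]            # Q_0 and Q_1, Eq. 12.235
    for n in range(1, nmax):
        # (2n + 1) x Q_n = (n + 1) Q_{n+1} + n Q_{n-1}
        q.append(((2 * n + 1) * x * q[n] - n * q[n - 1]) / (n + 1))
    return q[: nmax + 1]

def legendre_p(n, x):
    """P_n(x), evaluated with NumPy for comparison purposes."""
    return float(legendre.legval(x, [0] * n + [1]))

x = 0.3
q = legendre_q(6, x)
q_neg = legendre_q(6, -x)

# Closed form of Q_2, Eq. 12.236.
q2_closed = 0.5 * legendre_p(2, x) * np.log((1 + x) / (1 - x)) - 1.5 * legendre_p(1, x)
assert abs(q[2] - q2_closed) < 1e-12

for n in range(7):
    # Parity: Q_n(-x) = (-1)^(n+1) Q_n(x).
    assert abs(q_neg[n] - (-1) ** (n + 1) * q[n]) < 1e-12

for n in range(1, 7):
    # Exercise 12.10.6: n [P_n Q_{n-1} - P_{n-1} Q_n] = 1.
    lhs = n * (legendre_p(n, x) * q[n - 1] - legendre_p(n - 1, x) * q[n])
    assert abs(lhs - 1.0) < 1e-12

# Special value for odd n: Q_3(0) = 2/3.
assert abs(legendre_q(3, 0.0)[3] - 2.0 / 3.0) < 1e-14
```
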


EXERCISES 


12.10.1 Derive the parity relation for $Q_n(x)$.

12.10.2 From Eqs. 12.226 and 12.227 show that

(a) $$P_{2n}(x) = \frac{(-1)^n}{2^{2n-1}} \sum_{s=0}^{n} (-1)^s \frac{(2n + 2s - 1)!}{(2s)!\,(n + s - 1)!\,(n - s)!}\, x^{2s},$$

(b) $$P_{2n+1}(x) = \frac{(-1)^n}{2^{2n}} \sum_{s=0}^{n} (-1)^s \frac{(2n + 2s + 1)!}{(2s + 1)!\,(n + s)!\,(n - s)!}\, x^{2s+1}.$$

Check the normalization by showing that one term of each series agrees with
the corresponding term of Eq. 12.8.
12.10.3 Show that

(a) $$Q_{2n}(x) = (-1)^n 2^{2n} \sum_{s=0}^{n} (-1)^s \frac{(n + s)!\,(n - s)!}{(2s + 1)!\,(2n - 2s)!}\, x^{2s+1} + 2^{2n} \sum_{s=n+1}^{\infty} \frac{(n + s)!\,(2s - 2n)!}{(2s + 1)!\,(s - n)!}\, x^{2s+1}, \quad |x| < 1.$$

(b) $$Q_{2n+1}(x) = (-1)^{n+1} 2^{2n} \sum_{s=0}^{n} (-1)^s \frac{(n + s)!\,(n - s)!}{(2s)!\,(2n - 2s + 1)!}\, x^{2s} + 2^{2n+1} \sum_{s=n+1}^{\infty} \frac{(n + s)!\,(2s - 2n - 2)!}{(2s)!\,(s - n - 1)!}\, x^{2s}, \quad |x| < 1.$$


12.10.4 (a) Starting with the assumed form

$$Q_n(x) = \sum_{\lambda=0}^{\infty} b_{-\lambda}\, x^{k-\lambda},$$

show that

$$Q_n(x) = b_0\, x^{-n-1} \sum_{s=0}^{\infty} \frac{(n + s)!\,(n + 2s)!\,(2n + 1)!}{s!\,(n!)^2\,(2n + 2s + 1)!}\, x^{-2s}.$$

(b) The standard choice of $b_0$ is

$$b_0 = \frac{2^n (n!)^2}{(2n + 1)!}.$$

Show that this choice of $b_0$ brings this negative power series form of $Q_n(x)$
into agreement with the closed form solutions.


12.10.5 Verify that the Legendre functions of the second kind, $Q_n(x)$, satisfy the same
recurrence relations as $P_n(x)$, both for $|x| < 1$ and for $|x| > 1$:

$$(2n + 1)xQ_n(x) = (n + 1)Q_{n+1}(x) + nQ_{n-1}(x),$$

$$(2n + 1)Q_n(x) = Q'_{n+1}(x) - Q'_{n-1}(x).$$

12.10.6 (a) Using the recurrence relations, prove (independently of the Wronskian
relation) that

$$n\left[P_n(x)Q_{n-1}(x) - P_{n-1}(x)Q_n(x)\right] = P_1(x)Q_0(x) - P_0(x)Q_1(x).$$

(b) By direct substitution show that the right-hand side of this equation
equals 1.

12.10.7 (a) Write a subroutine that will generate $Q_n(x)$ and lower index $Q$'s based on
the recurrence relation for these Legendre functions of the second kind.
Take $x$ to be within $(-1, 1)$, excluding the end-points.

Hint. Take $Q_0(x)$ and $Q_1(x)$ to be known.

(b) Test your subroutine for accuracy by computing $Q_{10}(x)$ and comparing
with the values tabulated in AMS-55 (Chapter 8).


12.11 VECTOR SPHERICAL HARMONICS

Most of our attention in this chapter has been directed toward solving the
equations of scalar fields such as the electrostatic field. This was done primarily
because the scalar fields are easier to handle than vector fields! However, with
scalar field problems under firm control, more and more attention is being paid
to vector field problems.


Magnetic Field of a Current Loop 

To illustrate the difficulties, let us consider the equation$^1$

$$\nabla \times \nabla \times \mathbf{A} = \mu_0 \mathbf{J} \tag{12.241}$$

for the magnetic vector potential. Let us further suppose that the boundary
conditions are best expressed in spherical polar coordinates. In the example of
a current loop (Section 12.5) it was possible to handle this equation because
the form of $\mathbf{A}$ was highly restricted. In general, this equation will yield three
scalar equations, each involving all three components of $\mathbf{A}$: $A_r$, $A_\theta$, and $A_\varphi$.
Such coupled differential equations can be solved, but the complexities are
formidable.

Setting $\nabla \cdot \mathbf{A} = 0$, we can convert our equation into the vector Laplacian
$\nabla^2 \mathbf{A}$. This will separate into one equation for each component in cartesian
coordinates. Unfortunately, our boundary conditions (for the current loop)
are in spherical coordinates. To satisfy them we would still have to mix the
cartesian components $A_x$, $A_y$, and $A_z$ in a form that would probably be both
awkward and difficult to handle.

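To facilitate the solution of Eq. 12.241 and other equations, such as the
vector Helmholtz and the vector wave equation, we have used various
combinations of the (scalar) spherical harmonics to construct vectors in spherical
polar coordinates. One set, useful in quantum mechanics, has been described
by Hill.$^2$ His three vector spherical harmonics are

$$\mathbf{V}_{LM} = \mathbf{r}_0\left[-\left(\frac{L + 1}{2L + 1}\right)^{1/2} Y_L^M\right] + \boldsymbol{\theta}_0\left[\frac{1}{[(L + 1)(2L + 1)]^{1/2}}\frac{\partial Y_L^M}{\partial\theta}\right] + \boldsymbol{\varphi}_0\left[\frac{iM}{[(L + 1)(2L + 1)]^{1/2}\sin\theta}\, Y_L^M\right], \tag{12.242}$$

$$\mathbf{W}_{LM} = \mathbf{r}_0\left[\left(\frac{L}{2L + 1}\right)^{1/2} Y_L^M\right] + \boldsymbol{\theta}_0\left[\frac{1}{[L(2L + 1)]^{1/2}}\frac{\partial Y_L^M}{\partial\theta}\right] + \boldsymbol{\varphi}_0\left[\frac{iM}{[L(2L + 1)]^{1/2}\sin\theta}\, Y_L^M\right], \tag{12.243}$$

$$\mathbf{X}_{LM} = \boldsymbol{\theta}_0\left[\frac{-M}{[L(L + 1)]^{1/2}\sin\theta}\, Y_L^M\right] + \boldsymbol{\varphi}_0\left[\frac{-i}{[L(L + 1)]^{1/2}}\frac{\partial Y_L^M}{\partial\theta}\right]. \tag{12.244}$$

These functions satisfy a general orthogonality relation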


1 Compare Exercise 1.14.5 for a derivation from Maxwell’s equations. 

2 E. H. Hill, “Theory of Vector Spherical Harmonics,’’ Am . J. Phys. 22 , 211 
(1954); also J. M. Blatt and V. Weisskopf, Theoretical Nuclear Physics. 
New York: Wiley (1952). Note that Hill assigns phases in accordance with 
the Condon-Shortley phase convention (Section 12.6). 
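$$\int \mathbf{A}_{LM}^{*} \cdot \mathbf{B}_{L'M'}\, d\Omega = \delta_{AB}\,\delta_{LL'}\,\delta_{MM'}, \tag{12.245}$$

where $\mathbf{A}$ and $\mathbf{B}$ may be $\mathbf{V}$, $\mathbf{X}$, or $\mathbf{W}$. This may be verified by using the definitions
of $\mathbf{V}$, $\mathbf{X}$, and $\mathbf{W}$ and reducing the integral to one of ordinary orthonormal
spherical harmonics, $Y_L^M(\theta, \varphi)$.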

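Under the parity operations (coordinate inversion) the vector spherical
harmonics transform as

$$\begin{aligned} \mathbf{V}_{LM}(\theta', \varphi') &= (-1)^{L+1}\,\mathbf{V}_{LM}(\theta, \varphi), \\ \mathbf{X}_{LM}(\theta', \varphi') &= (-1)^{L}\,\mathbf{X}_{LM}(\theta, \varphi), \\ \mathbf{W}_{LM}(\theta', \varphi') &= (-1)^{L+1}\,\mathbf{W}_{LM}(\theta, \varphi), \end{aligned} \tag{12.246}$$

where

$$\theta' = \pi - \theta, \qquad \varphi' = \pi + \varphi. \tag{12.247}$$

In verifying these relations, the reader should remember that the spherical polar
coordinate unit vectors $\mathbf{r}_0$ and $\boldsymbol{\varphi}_0$ are odd and $\boldsymbol{\theta}_0$ is even. These properties may
be verified by expressing the unit vectors $\mathbf{r}_0$, $\boldsymbol{\theta}_0$, and $\boldsymbol{\varphi}_0$ in terms of the cartesian
unit vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ and spherical polar coordinates.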

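To demonstrate the use of the vector spherical harmonics, consider Eq. 12.241
again. From Hill's table of differential relations

$$\nabla \cdot [F(r)\mathbf{V}_{LM}(\theta, \varphi)] = -\left(\frac{L + 1}{2L + 1}\right)^{1/2}\left[\frac{dF}{dr} + \frac{L + 2}{r}F\right] Y_L^M(\theta, \varphi), \tag{12.248}$$

$$\nabla \cdot [F(r)\mathbf{W}_{LM}(\theta, \varphi)] = \left(\frac{L}{2L + 1}\right)^{1/2}\left[\frac{dF}{dr} - \frac{L - 1}{r}F\right] Y_L^M(\theta, \varphi), \tag{12.249}$$

$$\nabla \cdot [F(r)\mathbf{X}_{LM}(\theta, \varphi)] = 0. \tag{12.250}$$

The condition

$$\nabla \cdot \mathbf{A} = 0 \tag{12.251}$$

eliminates $\mathbf{V}_{LM}$ and $\mathbf{W}_{LM}$, leaving only $\mathbf{X}_{LM}$. In the absence of current ($\mathbf{J} = 0$),
that is, away from the current loop, Eq. 12.241, subject to Eq. 12.251, becomes

$$\nabla^2 \mathbf{A} = 0. \tag{12.252}$$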


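Using another Hill differential relation with $\mathbf{A}_{LM} = R(r)\mathbf{X}_{LM}(\theta, \varphi)$, we obtain

$$\nabla^2[R(r)\mathbf{X}_{LM}(\theta, \varphi)] = \left[\frac{d^2R}{dr^2} + \frac{2}{r}\frac{dR}{dr} - \frac{L(L + 1)}{r^2}R\right]\mathbf{X}_{LM} = 0, \tag{12.253}$$

in agreement with our Eq. 12.113. We have

$$\mathbf{A}_{LM} = a_{LM}\, r^{-L-1}\, \mathbf{X}_{LM}(\theta, \varphi). \tag{12.254}$$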

We note that there can be no azimuthal dependence because of the symmetry 
of our loop, M = 0, and our solution reduces to 





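$$\mathbf{A}_L = a_L\, r^{-L-1}\, \mathbf{X}_{L0}(\theta) = -\frac{i\, a_L\, r^{-L-1}}{[L(L + 1)]^{1/2}}\, \frac{dY_L^0}{d\theta}\, \boldsymbol{\varphi}_0. \tag{12.255}$$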

This is equivalent to Eq. 12.116. The constants $a_L$ are determined by fitting
boundary conditions, as done in Section 12.5 for $c_n$. The magnetic field may be
found from


V x 



(12.256) 


which corresponds to Eq. 12.119. [Here $F(r) = a_L r^{-L-1}$.]

The definitions of the vector spherical harmonics given here are dictated 
by convenience, primarily in quantum mechanical calculations, in which the 
angular momentum is a significant parameter. Morse and Feshbach describe 
another set of vector spherical harmonics, B, C, and P, in which the radial 
dependence is entirely in P and the angular dependence entirely in B and C. 
This set offers advantages in treating the wave equation when we want to 
separate the longitudinal and transverse parts of the wave. 

Further examples of the usefulness and power of the vector spherical
harmonics will be found in Blatt and Weisskopf, in Morse and Feshbach, and in
Jackson's Classical Electrodynamics, which uses vector spherical harmonics in
a description of multipole radiation and related electromagnetic problems.

Vector spherical harmonics may be developed as the result of coupling $L$
units of orbital angular momentum and 1 unit of spin angular momentum.
An extension, coupling $L$ units of orbital angular momentum and 2 units of
spin angular momentum to form tensor spherical harmonics, is presented by
Mathews.$^2$ The major application of tensor spherical harmonics is in the
investigation of gravitational radiation.


EXERCISES 


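12.11.1 Construct the $l = 0$, $m = 0$ and $l = 1$, $m = 0$ vector spherical harmonics.

ANS. $\mathbf{V}_{00} = -\mathbf{r}_0 (4\pi)^{-1/2}$
$\mathbf{X}_{00} = 0$
$\mathbf{W}_{00} = 0$
$\mathbf{V}_{10} = -\mathbf{r}_0 (2\pi)^{-1/2}\cos\theta - \boldsymbol{\theta}_0 (8\pi)^{-1/2}\sin\theta$
$\mathbf{X}_{10} = \boldsymbol{\varphi}_0\, i(3/8\pi)^{1/2}\sin\theta$
$\mathbf{W}_{10} = \mathbf{r}_0 (4\pi)^{-1/2}\cos\theta - \boldsymbol{\theta}_0 (4\pi)^{-1/2}\sin\theta$.

12.11.2 Verify that the parity of $\mathbf{V}_{LM}$ is $(-1)^{L+1}$, the parity of $\mathbf{X}_{LM}$ is $(-1)^L$, and that of
$\mathbf{W}_{LM}$ is $(-1)^{L+1}$. What happened to the $M$-dependence of the parity?

Hint. $\mathbf{r}_0$ and $\boldsymbol{\varphi}_0$ have odd parity; $\boldsymbol{\theta}_0$ has even parity (compare Exercise 2.5.8).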


2 J. Mathews, “Gravitational Multipole Radiation,” in H. P. Robertson, In 
Memoriam. Philadelphia: Society for Industrial and Applied Mathematics 
(1963). 
12.11.3 Verify the orthonormality of the vector spherical harmonics $\mathbf{V}_{LM}$, $\mathbf{X}_{LM}$, and
$\mathbf{W}_{LM}$.

12.11.4 In Classical Electrodynamics, 2nd ed., Jackson defines $\mathbf{X}_{LM}$ by the equation

$$\mathbf{X}_{LM}(\theta, \varphi) = \frac{1}{\sqrt{L(L + 1)}}\, \mathbf{L}\, Y_L^M(\theta, \varphi),$$

in which the angular momentum operator $\mathbf{L}$ is given by

$$\mathbf{L} = -i(\mathbf{r} \times \nabla).$$

Show that this definition agrees with Eq. 12.244.

12.11.5 Show that

$$\sum_{M=-L}^{L} \mathbf{X}_{LM}^{*}(\theta, \varphi) \cdot \mathbf{X}_{LM}(\theta, \varphi) = \frac{2L + 1}{4\pi}.$$

Hint. One way is to use Exercise 12.11.4 with $\mathbf{L}$ expanded in cartesian coordinates
using the raising and lowering operators of Section 12.7.

12.11.6 Show that

$$\int \mathbf{X}_{LM}^{*} \cdot (\mathbf{r}_0 \times \mathbf{X}_{LM})\, d\Omega = 0.$$

The integrand represents an interference term in electromagnetic radiation
that contributes to angular distributions but not to total intensity.


REFERENCES 

Hobson, E. W., The Theory of Spherical and Ellipsoidal Harmonics . New York: Chelsea 
(1955). 

This is a very complete reference, which is the classic text on Legendre polynomials and 
all related functions. 


See also the references listed at the end of Chapter 13. 



13 SPECIAL 
FUNCTIONS 


In this chapter we shall study four sets of orthogonal polynomials, Hermite, 
Laguerre, and Chebyshev 1 of first and second kinds. Although these four sets 
are of less importance in mathematical physics than the Bessel and Legendre 
functions of Chapters 11 and 12, they are used occasionally and therefore 
deserve at least a little attention. Section 13.4 is devoted to important numerical 
applications of Chebyshev polynomials. Because the general mathematical 
techniques duplicate those of the preceding two chapters, the development of 
these functions is only outlined. Detailed proofs, along the lines of Chapters 11 
and 12, are left to the reader. To conclude the chapter, we express these poly- 
nomials and other functions in terms of hypergeometric and confluent hyper- 
geometric functions. 


13.1 HERMITE FUNCTIONS 

Generating Functions — Hermite Polynomials 
The Hermite polynomials (Fig. 13.1), $H_n(x)$, may be defined by the generating
function$^2$

$$g(x, t) = e^{-t^2 + 2tx} = \sum_{n=0}^{\infty} H_n(x)\frac{t^n}{n!}. \tag{13.1}$$

Recurrence Relations 

Note the absence of a superscript on $H_n(x)$, which distinguishes it from the unrelated
Hankel functions. From the generating function we find that the Hermite
polynomials satisfy the recurrence relations

$$H_{n+1}(x) = 2xH_n(x) - 2nH_{n-1}(x) \tag{13.2}$$

and

$$H'_n(x) = 2nH_{n-1}(x). \tag{13.3}$$

Equation 13.2 may be obtained by differentiating the generating function with
respect to $t$; differentiation with respect to $x$ leads to Eq. 13.3.

1 This is the spelling choice of AMS-55. However, a variety of forms such as
Tschebyscheff is encountered.

2 A derivation of this Hermite generating function is outlined in Exercise 13.1.3.

Direct expansion of the generating function easily gives $H_0(x) = 1$ and
$H_1(x) = 2x$. Then Eq. 13.2 permits the construction of any $H_n(x)$ desired
(integral $n$). For convenient reference the first several Hermite polynomials
are listed in Table 13.1.

Special values of the Hermite polynomials follow from the generating
function; that is,

$$H_{2n}(0) = (-1)^n \frac{(2n)!}{n!}, \tag{13.4}$$

$$H_{2n+1}(0) = 0. \tag{13.5}$$

We also obtain from the generating function the important parity relation

$$H_n(x) = (-1)^n H_n(-x). \tag{13.6}$$
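
The construction of $H_n(x)$ from Eq. 13.2 is easy to mechanize. The following is a minimal sketch using exact integer arithmetic; the function name `hermite_coeffs` is an illustrative choice, not something defined in the text. It starts from $H_0 = 1$ and $H_1 = 2x$ and checks the results against Table 13.1 and the special values of Eqs. 13.4 to 13.6.

```python
from math import factorial

def hermite_coeffs(nmax):
    """Integer coefficients (ascending powers of x) of H_0 ... H_nmax via Eq. 13.2."""
    polys = [[1], [0, 2]]                      # H_0 = 1, H_1 = 2x
    for n in range(1, nmax):
        nxt = [0] * (n + 2)
        for power, c in enumerate(polys[n]):   # contribution 2x * H_n
            nxt[power + 1] += 2 * c
        for power, c in enumerate(polys[n - 1]):   # contribution -2n * H_{n-1}
            nxt[power] -= 2 * n * c
        polys.append(nxt)
    return polys[: nmax + 1]

table = hermite_coeffs(6)

# Entries of Table 13.1.
assert table[2] == [-2, 0, 4]
assert table[4] == [12, 0, -48, 0, 16]
assert table[6] == [-120, 0, 720, 0, -480, 0, 64]

for n in range(4):
    # Eq. 13.4: H_{2n}(0) = (-1)^n (2n)!/n!   (n = 0..3 fits within H_6)
    assert table[2 * n][0] == (-1) ** n * factorial(2 * n) // factorial(n)

for n, poly in enumerate(table):
    # Parity, Eq. 13.6: only powers with the same parity as n appear.
    assert all(c == 0 for power, c in enumerate(poly) if (power - n) % 2)
```

The same routine addresses the coefficient generation asked for in Exercise 13.1.19, although that exercise may intend a different approach.
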


Alternate Representations 

Differentiation of the generating function$^3$ $n$ times with respect to $t$ and then
setting $t$ equal to zero yields

$$H_n(x) = (-1)^n e^{x^2}\frac{d^n}{dx^n}\left(e^{-x^2}\right). \tag{13.7}$$

This gives us a Rodrigues representation of $H_n(x)$. A second representation may
be obtained by using the calculus of residues (Chapter 7). If we multiply Eq. 13.1
by $t^{-m-1}$ and integrate around the origin, only the term with $H_m(x)$ will survive:

$$H_m(x) = \frac{m!}{2\pi i}\oint t^{-m-1} e^{-t^2 + 2tx}\, dt. \tag{13.8}$$

3 Rewrite the generating function as $g(x, t) = e^{x^2} e^{-(t - x)^2}$. Note that

$$\frac{\partial}{\partial t}e^{-(t - x)^2} = -\frac{\partial}{\partial x}e^{-(t - x)^2}.$$
TABLE 13.1 Hermite Polynomials

$H_0(x) = 1$
$H_1(x) = 2x$
$H_2(x) = 4x^2 - 2$
$H_3(x) = 8x^3 - 12x$
$H_4(x) = 16x^4 - 48x^2 + 12$
$H_5(x) = 32x^5 - 160x^3 + 120x$
$H_6(x) = 64x^6 - 480x^4 + 720x^2 - 120$


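Also, from Eq. 13.1 we may write our Hermite polynomial $H_n(x)$ in series form:

$$\begin{aligned} H_n(x) &= (2x)^n - \frac{2\,n!}{(n - 2)!\,2!}(2x)^{n-2} + \frac{4\,n!}{(n - 4)!\,4!}(2x)^{n-4}\, 1 \cdot 3 - \cdots \\ &= \sum_{s=0}^{[n/2]} (-2)^s (2x)^{n-2s}\binom{n}{2s}\, 1 \cdot 3 \cdot 5 \cdots (2s - 1) \\ &= \sum_{s=0}^{[n/2]} (-1)^s (2x)^{n-2s}\frac{n!}{(n - 2s)!\,s!}. \end{aligned} \tag{13.9}$$

This terminates for integral $n$ and yields our Hermite polynomial.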


Orthogonality 

The recurrence relations (Eqs. 13.2 and 13.3) lead to the second-order linear
differential equation

$$H''_n(x) - 2xH'_n(x) + 2nH_n(x) = 0, \tag{13.10}$$

which is clearly not self-adjoint.

To put Eq. 13.10 in self-adjoint form, we multiply by $\exp(-x^2)$, Exercise 9.1.2.
This leads to the orthogonality integral

$$\int_{-\infty}^{\infty} H_m(x)H_n(x)e^{-x^2}\, dx = 0, \quad m \neq n, \tag{13.10a}$$

with the weighting function $\exp(-x^2)$ a consequence of putting the differential
equation into self-adjoint form. The interval $(-\infty, \infty)$ is selected to satisfy the
Hermitian operator boundary conditions, Section 9.1. It is sometimes
convenient to absorb the weighting function into the Hermite polynomials. We may
define

$$\varphi_n(x) = e^{-x^2/2}H_n(x) \tag{13.11}$$

with $\varphi_n(x)$ no longer a polynomial.

Substitution into Eq. 13.10 yields the differential equation for $\varphi_n(x)$,

$$\varphi''_n(x) + (2n + 1 - x^2)\varphi_n(x) = 0. \tag{13.12}$$

This is the differential equation for a quantum mechanical, simple harmonic
oscillator, which is perhaps the most important single application of the Hermite
polynomials. Equation 13.12 is self-adjoint and the solutions $\varphi_n(x)$ are
orthogonal for the interval $(-\infty < x < \infty)$ with a unit weighting function.

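The problem of normalizing these functions remains. Proceeding as in
Section 12.3, we multiply Eq. 13.1 by itself and then by $e^{-x^2}$. This yields

$$e^{-x^2}e^{-s^2 + 2sx}e^{-t^2 + 2tx} = \sum_{m,n=0}^{\infty} e^{-x^2}H_m(x)H_n(x)\frac{s^m t^n}{m!\,n!}. \tag{13.13}$$

When we integrate over $x$ from $-\infty$ to $+\infty$ the cross terms of the double sum
drop out because of the orthogonality property$^4$

$$\begin{aligned} \sum_{n=0}^{\infty}\frac{(st)^n}{n!\,n!}\int_{-\infty}^{\infty} e^{-x^2}[H_n(x)]^2\, dx &= \int_{-\infty}^{\infty} e^{-x^2 - s^2 + 2sx - t^2 + 2tx}\, dx \\ &= \int_{-\infty}^{\infty} e^{-(x - s - t)^2}e^{2st}\, dx \\ &= \pi^{1/2}e^{2st} = \pi^{1/2}\sum_{n=0}^{\infty}\frac{2^n(st)^n}{n!}. \end{aligned} \tag{13.14}$$

By equating coefficients of like powers of $st$, we obtain

$$\int_{-\infty}^{\infty} e^{-x^2}[H_n(x)]^2\, dx = 2^n\pi^{1/2}n!. \tag{13.15}$$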
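
Equations 13.10a and 13.15 lend themselves to a quick numerical check with Gauss-Hermite quadrature, which builds the weight $\exp(-x^2)$ into its weights and is exact for polynomial integrands of sufficiently low degree. The sketch below assumes NumPy; the helper names `hermite_overlap` and `hermite_norm` are illustrative only.

```python
from math import factorial, pi, sqrt

import numpy as np
from numpy.polynomial import hermite   # physicists' Hermite polynomials H_n

def hermite_overlap(m, n, npts=40):
    """Integral of exp(-x^2) H_m(x) H_n(x) over the real line by Gauss-Hermite quadrature."""
    x, w = hermite.hermgauss(npts)             # the weights already contain exp(-x^2)
    h_m = hermite.hermval(x, [0] * m + [1])    # H_m at the nodes
    h_n = hermite.hermval(x, [0] * n + [1])    # H_n at the nodes
    return float(np.sum(w * h_m * h_n))

def hermite_norm(n):
    """Right-hand side of Eq. 13.15."""
    return 2**n * sqrt(pi) * factorial(n)

for m in range(7):
    for n in range(7):
        value = hermite_overlap(m, n)
        if m == n:
            # Normalization, Eq. 13.15.
            assert abs(value - hermite_norm(n)) < 1e-9 * hermite_norm(n)
        else:
            # Orthogonality, Eq. 13.10a, relative to the size of the two norms.
            assert abs(value) < 1e-9 * sqrt(hermite_norm(m) * hermite_norm(n))
```
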


Quantum Mechanical Simple Harmonic Oscillator 
As already indicated, the Hermite polynomials are used in analyzing the
quantum mechanical simple harmonic oscillator. For a potential energy
$V = \frac{1}{2}Kz^2 = \frac{1}{2}m\omega^2 z^2$ (force $\mathbf{F} = -\nabla V = -Kz$), the Schrödinger wave equation
is

$$-\frac{\hbar^2}{2m}\nabla^2\Psi(z) + \frac{1}{2}Kz^2\,\Psi(z) = E\,\Psi(z). \tag{13.16}$$

Our oscillating particle has mass $m$ and total energy $E$. By use of the
abbreviations

$$x = \alpha z \quad \text{with} \quad \alpha^4 = \frac{mK}{\hbar^2} = \frac{m^2\omega^2}{\hbar^2}, \qquad \lambda = \frac{2E}{\hbar}\left(\frac{m}{K}\right)^{1/2} = \frac{2E}{\hbar\omega}, \tag{13.17}$$

in which $\omega$ is the angular frequency of the corresponding classical oscillator,
Eq. 13.16 becomes [with $\Psi(z) = \Psi(x/\alpha) = \psi(x)$]

$$\frac{d^2\psi(x)}{dx^2} + (\lambda - x^2)\psi(x) = 0. \tag{13.18}$$


4 The cross terms ($m \neq n$) may be left in, if desired. Then, when the coefficients
of $s^{\alpha}t^{\beta}$ are equated, the orthogonality will be apparent.






FIG. 13.2 Quantum mechanical oscil- 
lator wave functions : the heavy bar on the 
x-axis indicates the allowed range of the 
classical oscillator with the same total 
energy 


This is Eq. 13.12 with $\lambda = 2n + 1$. Hence (Fig. 13.2),

$$\psi_n(x) = 2^{-n/2}\pi^{-1/4}(n!)^{-1/2}e^{-x^2/2}H_n(x) \quad \text{(normalized)}. \tag{13.19}$$

The requirement that $n$ be an integer is dictated by the boundary conditions of
the quantum mechanical system,

$$\lim_{z \to \pm\infty}\Psi(z) = 0.$$

Specifically, if $n \to \nu$, not an integer, a power-series solution of Eq. 13.10 (Exercise
8.5.6) shows that $H_\nu(x)$ will behave as $x^{\nu}e^{x^2}$ for large $x$. The functions $\psi_\nu(x)$ and
$\Psi_\nu(z)$ will therefore blow up at infinity, and it will be impossible to normalize
the wave function $\Psi(z)$. With this requirement, energy $E$ becomes

$$E = \left(n + \tfrac{1}{2}\right)\hbar\omega. \tag{13.20}$$

As $n$ ranges over integral values ($n \geq 0$), we see that the energy is quantized and
that there is a minimum or zero point energy

$$E_{\min} = \tfrac{1}{2}\hbar\omega. \tag{13.21}$$

This zero point energy is an aspect of the uncertainty principle, a purely quantum
phenomenon.
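
The quantization condition of Eq. 13.20 can be illustrated numerically without any reference to Hermite polynomials. The sketch below is not the method of the text: it swaps in a simple second-order finite-difference discretization of Eq. 13.18, written as $(-\frac{1}{2}d^2/dx^2 + \frac{1}{2}x^2)\psi = (E/\hbar\omega)\psi$, on a finite interval with $\psi = 0$ at the ends. The function name `oscillator_levels`, the interval half-width, and the grid size are assumptions chosen for illustration, so only approximate agreement with $n + \frac{1}{2}$ should be expected.

```python
import numpy as np

def oscillator_levels(num_levels=5, half_width=10.0, npts=1000):
    """Lowest eigenvalues of -1/2 d^2/dx^2 + 1/2 x^2 (energies in units of hbar*omega)."""
    x = np.linspace(-half_width, half_width, npts)
    h = x[1] - x[0]
    # Central difference: -psi'' ~ (2 psi_i - psi_{i-1} - psi_{i+1}) / h^2.
    diagonal = 1.0 / h**2 + 0.5 * x**2
    off_diagonal = -0.5 / h**2 * np.ones(npts - 1)
    hamiltonian = (np.diag(diagonal)
                   + np.diag(off_diagonal, 1)
                   + np.diag(off_diagonal, -1))
    return np.linalg.eigvalsh(hamiltonian)[:num_levels]   # eigenvalues in ascending order

levels = oscillator_levels()
for n, energy in enumerate(levels):
    # Compare with E = (n + 1/2) hbar*omega; the tolerance reflects discretization error.
    assert abs(energy - (n + 0.5)) < 2e-3
```

Refining the grid reduces the discretization error roughly in proportion to the square of the grid spacing, at the cost of a larger dense matrix.
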


Raising and Lowering Operators 

An alternate treatment of the quantum mechanical oscillator found in many
quantum mechanics texts employs raising and lowering operators:

$$\frac{1}{\sqrt{2}}\left(x - \frac{d}{dx}\right)\psi_n(x) = (n + 1)^{1/2}\psi_{n+1}(x), \tag{13.22a}$$

$$\frac{1}{\sqrt{2}}\left(x + \frac{d}{dx}\right)\psi_n(x) = n^{1/2}\psi_{n-1}(x). \tag{13.22b}$$

Often in quantum mechanics the raising operator is labeled a creation operator,
$a^{\dagger}$, and the lowering operator an annihilation operator, $a$. In this approach the
wave function $\psi_n$ (actually given by Eq. 13.19) is treated as unknown at the
outset. The development is similar to the use of the raising and lowering
operators presented in Section 12.7. The minimum energy or ground state wave
function, $\psi_0$, satisfies the equation

$$\left(x + \frac{d}{dx}\right)\psi_0(x) = 0. \tag{13.23}$$

Normalized to unity,

$$\psi_0(x) = \pi^{-1/4}e^{-x^2/2}, \tag{13.23a}$$

in agreement with Eq. 13.19. The excited state wave functions, $\psi_1$, $\psi_2$, and so on,
are then generated by the raising operator, Eq. 13.22a. The verification of these
raising and lowering operators, Eqs. 13.22a and 13.22b, is left as Exercise 13.1.16.
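
The action of the operators in Eqs. 13.22a and 13.22b can be checked on the polynomial factor alone. Writing $\psi_n(x) = e^{-x^2/2}p(x)$, one finds $(x - d/dx)\psi_n = e^{-x^2/2}(2xp - p')$ and $(x + d/dx)\psi_n = e^{-x^2/2}p'$. The sketch below assumes NumPy and uses illustrative helper names (`psi_poly`, `raise_poly`, `lower_poly`) that do not appear in the text; it is a numerical spot check, not a substitute for the verification requested in Exercise 13.1.16.

```python
from math import factorial, pi, sqrt

import numpy as np
from numpy.polynomial import hermite, polynomial as P

def psi_poly(n):
    """Ascending coefficients of the polynomial factor of psi_n, Eq. 13.19."""
    norm = 1.0 / sqrt(2**n * factorial(n) * sqrt(pi))
    return norm * hermite.herm2poly([0] * n + [1])

def raise_poly(c):
    """Polynomial factor after applying (x - d/dx)/sqrt(2) to exp(-x^2/2) * p(x)."""
    return P.polysub(2 * P.polymulx(c), P.polyder(c)) / sqrt(2)

def lower_poly(c):
    """Polynomial factor after applying (x + d/dx)/sqrt(2) to exp(-x^2/2) * p(x)."""
    return P.polyder(c) / sqrt(2)

for n in range(6):
    # Eq. 13.22a: the raising operator gives sqrt(n + 1) psi_{n+1}.
    assert np.allclose(raise_poly(psi_poly(n)), sqrt(n + 1) * psi_poly(n + 1))

for n in range(1, 6):
    # Eq. 13.22b: the lowering operator gives sqrt(n) psi_{n-1}.
    assert np.allclose(lower_poly(psi_poly(n)), sqrt(n) * psi_poly(n - 1))

# Eq. 13.23: the lowering operator annihilates the ground state.
assert np.allclose(lower_poly(psi_poly(0)), 0.0)
```
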

In quantum mechanical problems, particularly in molecular spectroscopy,
a number of integrals of the form

$$\int_{-\infty}^{\infty} x^r e^{-x^2}H_n(x)H_m(x)\, dx$$

are needed. Examples for $r = 1$ and $r = 2$ (with $n = m$) are included in the
exercises at the end of this section. A large number of other examples are
contained in Wilson, Decius, and Cross.$^5$

The oscillator potential has also been employed extensively in calculations
of nuclear structure (nuclear shell model).

There is a second independent solution of Eq. 13.10. This Hermite function
of the second kind is an infinite series (Sections 8.5, 8.6) and of no physical
interest, at least not yet.


EXERCISES 

13.1.1 Assume the Hermite polynomials are known as solutions of the differential
equation (13.10) and from this the recurrence relation, Eq. 13.3, and the values
of $H_n(0)$ are also known.

(a) Assume the existence of a generating function

$$g(x, t) = \sum_{n=0}^{\infty} H_n(x)\frac{t^n}{n!}.$$

(b) Differentiate $g(x, t)$ with respect to $x$ and using the recurrence relation
develop a first-order differential equation for $g(x, t)$.

(c) Integrate with respect to $x$, holding $t$ fixed.

(d) Evaluate $g(0, t)$ using Eqs. 13.4 and 13.5. Finally, show that

$$g(x, t) = \exp(-t^2 + 2tx).$$

5 E. B. Wilson, Jr., J. C. Decius, and P. C. Cross, Molecular Vibrations.
New York: McGraw-Hill (1955).

13.1.2 In developing the properties of the Hermite polynomials, you could start at a
number of different points such as:

1. Hermite differential equation, Eq. 13.10,

2. Rodrigues' formula, Eq. 13.7,

3. Integral representation, Eq. 13.8,

4. Generating function, Eq. 13.1,

5. Gram-Schmidt construction of a complete set of orthogonal
polynomials over $(-\infty, \infty)$ with a weighting factor of
$\exp(-x^2)$, Section 9.3.

Outline how you can go from any one of these starting points to all the other
points.


13.1.3 From the generating function show that

$$H_n(x) = \sum_{s=0}^{[n/2]} (-1)^s \frac{n!}{(n - 2s)!\,s!}(2x)^{n-2s}.$$

13.1.4 From the generating function derive the recurrence relations

$$H_{n+1}(x) = 2xH_n(x) - 2nH_{n-1}(x),$$

$$H'_n(x) = 2nH_{n-1}(x).$$

13.1.5 Prove that

$$\left(2x - \frac{d}{dx}\right)^n 1 = H_n(x).$$

Hint. Check out the first couple of examples and then use mathematical
induction.


13 . 1.6 Prove that 

\H„(x)\ < 

13.1.7 Rewrite the series form of $H_n(x)$, Eq. 13.9, as an ascending power series.

ANS. $$H_{2n}(x) = (-1)^n \sum_{s=0}^{n} (-1)^s (2x)^{2s}\frac{(2n)!}{(2s)!\,(n - s)!},$$

$$H_{2n+1}(x) = (-1)^n \sum_{s=0}^{n} (-1)^s (2x)^{2s+1}\frac{(2n + 1)!}{(2s + 1)!\,(n - s)!}.$$

13.1.8 (a) Expand $x^{2r}$ in a series of even order Hermite polynomials.

(b) Expand $x^{2r+1}$ in a series of odd order Hermite polynomials.

ANS. (a) $$x^{2r} = \frac{(2r)!}{2^{2r}}\sum_{n=0}^{r}\frac{H_{2n}(x)}{(2n)!\,(r - n)!},$$

(b) $$x^{2r+1} = \frac{(2r + 1)!}{2^{2r+1}}\sum_{n=0}^{r}\frac{H_{2n+1}(x)}{(2n + 1)!\,(r - n)!}, \quad r = 0, 1, 2, \ldots.$$

Hint. Use a Rodrigues representation of $H_{2n}(x)$ and integrate by parts.
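13.1.9 Show that

(a) $$\int_{-\infty}^{\infty} H_n(x)\exp[-x^2/2]\, dx = \begin{cases} (2\pi)^{1/2}\, n!/(n/2)!, & n \text{ even}, \\ 0, & n \text{ odd}. \end{cases}$$

(b) $$\int_{-\infty}^{\infty} xH_n(x)\exp[-x^2/2]\, dx = \begin{cases} 0, & n \text{ even}, \\ (2\pi)^{1/2}\,\dfrac{(n + 1)!}{\left(\dfrac{n + 1}{2}\right)!}, & n \text{ odd}. \end{cases}$$

13.1.10 Show that

$$\int_{-\infty}^{\infty} x^m e^{-x^2}H_n(x)\, dx = 0 \quad \text{for } m \text{ an integer},\ 0 \leq m \leq n - 1.$$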

13.1.11 The transition probability between two oscillator states, $m$ and $n$, depends on

$$\int_{-\infty}^{\infty} xe^{-x^2}H_n(x)H_m(x)\, dx.$$

Show that this integral equals $\pi^{1/2}2^{n-1}n!\,\delta_{m,n-1} + \pi^{1/2}2^n(n + 1)!\,\delta_{m,n+1}$.

This result shows that such transitions can occur only between states of adjacent
energy levels, $m = n \pm 1$.

Hint. Multiply the generating function (Eq. 13.1) by itself using two different sets
of variables $(x, s)$ and $(x, t)$. Alternatively, the factor $x$ may be eliminated by the
recurrence relation Eq. 13.2.


13.1.12 Show that

$$\int_{-\infty}^{\infty} x^2 e^{-x^2}H_n(x)H_n(x)\, dx = \pi^{1/2}2^n n!\left(n + \tfrac{1}{2}\right).$$

This integral occurs in the calculation of the mean-square displacement of our
quantum oscillator.

Hint. Use the recurrence relation Eq. 13.2 and the orthogonality integral.

13.1.13 Evaluate

$$\int_{-\infty}^{\infty} x^2\exp[-x^2]H_n(x)H_m(x)\, dx$$

in terms of $n$ and $m$ and appropriate Kronecker delta functions.

ANS. $2^{n-1}\pi^{1/2}(2n + 1)n!\,\delta_{nm} + 2^n\pi^{1/2}(n + 2)!\,\delta_{n+2,m} + 2^{n-2}\pi^{1/2}n!\,\delta_{n-2,m}$.

13.1.14 Show that

$$\int_{-\infty}^{\infty} x^r\exp[-x^2]H_n(x)H_{n+p}(x)\, dx = \begin{cases} 0, & p > r, \\ 2^n\pi^{1/2}(n + r)!, & p = r. \end{cases}$$

$n$, $p$, and $r$ are nonnegative integers.

Hint. Use the recurrence relation, Eq. 13.2, $p$ times.

13.1.15 (a) Using the Cauchy integral formula, develop an integral representation of
$H_n(x)$ based on Eq. 13.1 with the contour enclosing the point $z = -x$.

ANS. $$H_n(x) = \frac{n!}{2\pi i}e^{x^2}\oint\frac{e^{-z^2}}{(z + x)^{n+1}}\, dz.$$

(b) Show by direct substitution that this result satisfies the Hermite equation.
13.1.16 With

$$\psi_n(x) = e^{-x^2/2}H_n(x)/(2^n n!\,\pi^{1/2})^{1/2}$$

verify that

$$a\psi_n(x) = \frac{1}{\sqrt{2}}\left(x + \frac{d}{dx}\right)\psi_n(x) = n^{1/2}\psi_{n-1}(x),$$

$$a^{\dagger}\psi_n(x) = \frac{1}{\sqrt{2}}\left(x - \frac{d}{dx}\right)\psi_n(x) = (n + 1)^{1/2}\psi_{n+1}(x).$$

Note. The usual quantum mechanical operator approach establishes these
raising and lowering properties before the form of $\psi_n(x)$ is known.

13.1.17 (a) Verify the operator identity

$$x - \frac{d}{dx} = -\exp[x^2/2]\,\frac{d}{dx}\exp[-x^2/2].$$

(b) The normalized simple harmonic oscillator wave function is

$$\psi_n(x) = (\pi^{1/2}2^n n!)^{-1/2}\exp[-x^2/2]H_n(x).$$

Show that this may be written as

$$\psi_n(x) = (\pi^{1/2}2^n n!)^{-1/2}\left(x - \frac{d}{dx}\right)^n\exp[-x^2/2].$$

Note. This corresponds to an $n$-fold application of the raising operator of
Exercise 13.1.16.


13.1.18 (a) Show that the simple oscillator Hamiltonian (from Eq. 13.18) may be
written as

$$H = -\frac{1}{2}\frac{d^2}{dx^2} + \frac{1}{2}x^2 = \frac{1}{2}(aa^{\dagger} + a^{\dagger}a).$$

Hint. Express $E$ in units of $\hbar\omega$.

(b) Using the creation-annihilation operator formulation of part (a), show
that

$$H\psi_n(x) = \left(n + \tfrac{1}{2}\right)\psi_n(x).$$

This means the energy eigenvalues are $E = (n + \frac{1}{2})(\hbar\omega)$, in agreement with
Eq. 13.20.


13.1.19 Write a program that will generate the coefficients $a_s$ in the polynomial form
of the Hermite polynomial, $H_n(x) = \sum_{s=0}^{n} a_s x^s$.


13.1.20 A function $f(x)$ is expanded in an Hermite series:

$$f(x) = \sum_{n=0}^{\infty} a_n H_n(x).$$

From the orthogonality and normalization of the Hermite polynomials the
coefficient $a_n$ is given by

$$a_n = \frac{1}{2^n \pi^{1/2} n!}\int_{-\infty}^{\infty} f(x)\, H_n(x)\, e^{-x^2}\,dx.$$

For $f(x) = x^8$ determine the Hermite coefficients $a_n$ by the Gauss-Hermite
quadrature (Appendix 2). Check your coefficients against AMS-55, Table 22.12.
13.1.21 (a) In analogy with Exercise 12.2.13 set up the matrix of even Hermite
polynomial coefficients that will transform an even Hermite series into an even
power series:



Extend B to handle an even polynomial series through $H_8(x)$.

(b) Invert your matrix to obtain matrix A which will transform an even power
series (through $x^8$) into a series of even Hermite polynomials. Check the
elements of A against those listed in AMS-55 (Table 22.12).

(c) Finally, using matrix multiplication, determine the Hermite series
equivalent to $f(x) = x^8$.


13.1.22 Write a subroutine that will transform a finite power series $\sum_{n=0}^{N} a_n x^n$ into an
Hermite series $\sum_{n=0}^{N} b_n H_n(x)$. Use the recurrence relation Eq. 13.2 and follow
the technique outlined in Section 13.4 for a Chebyshev series.

Note. Both Exercises 13.1.21 and 13.1.22 are faster and more accurate than the
Gaussian quadrature, Exercise 13.1.20, if $f(x)$ is available as a power series.

13.1.23 Write a subroutine for evaluating Hermite polynomial matrix elements of the
form

$$M_{pqr} = \int_{-\infty}^{\infty} H_p(x)\, H_q(x)\, x^r e^{-x^2}\,dx,$$

using the 10-point Gauss-Hermite quadrature (for $p + q + r \le 19$). Include a
parity check and set equal to zero the integrals with odd parity integrand. Also,
check to see if $r$ is in the range $|p - q| \le r \le p + q$. Otherwise $M_{pqr} = 0$. Check
your results against the specific cases listed in Exercises 13.1.11, 13.1.12, 13.1.13,
and 13.1.14.


13.1.24 Calculate and tabulate the normalized linear oscillator wave functions
$\psi_n(x) = 2^{-n/2}\pi^{-1/4}(n!)^{-1/2} H_n(x)\exp(-x^2/2)$ for $x = 0.0(0.1)5.0$
and $n = 0(1)5$. If a plotting routine is available, plot your results.


13.2 LAGUERRE FUNCTIONS 

Differential Equation — Laguerre Polynomials 

If we start with the appropriate generating function, it is possible to develop 
the Laguerre polynomials in exact analogy with the Hermite polynomials. 
Alternatively, a series solution may be developed by the methods of Section 8.5. 
Instead, to illustrate a different technique, let us start with Laguerre’s differential
equation and obtain a solution in the form of a contour integral, as we did with
the modified Bessel function $K_\nu(x)$ (Section 11.6). From this integral
representation a generating function will be derived.

Laguerre’s differential equation is 

$$x\,y''(x) + (1 - x)\,y'(x) + n\,y(x) = 0. \tag{13.24}$$
FIG. 13.3 Laguerre function contour


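We shall attempt to represent $y$, or rather $y_n$, since $y$ will depend on $n$, by the
contour integral

$$y_n(x) = \frac{1}{2\pi i}\oint \frac{e^{-xz/(1-z)}}{(1-z)\,z^{n+1}}\,dz. \tag{13.25a}$$

The contour includes the origin but does not enclose the point $z = 1$. From
Section 6.4

$$y_n'(x) = -\frac{1}{2\pi i}\oint \frac{e^{-xz/(1-z)}}{(1-z)^2\, z^{n}}\,dz, \tag{13.25b}$$

$$y_n''(x) = \frac{1}{2\pi i}\oint \frac{e^{-xz/(1-z)}}{(1-z)^3\, z^{n-1}}\,dz. \tag{13.25c}$$

Substituting into the left-hand side of Eq. 13.24, we obtain

$$\frac{1}{2\pi i}\oint \left[\frac{x}{(1-z)^3 z^{n-1}} - \frac{1-x}{(1-z)^2 z^{n}} + \frac{n}{(1-z)\,z^{n+1}}\right] e^{-xz/(1-z)}\,dz,$$

which is equal to

$$-\frac{1}{2\pi i}\oint \frac{d}{dz}\left[\frac{e^{-xz/(1-z)}}{(1-z)\,z^{n}}\right] dz. \tag{13.26}$$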


If we integrate our perfect differential around a contour chosen so that the final
value equals the initial value (Fig. 13.3), the integral will vanish, thus verifying
that $y_n(x)$ (Eq. 13.25a) is a solution of Laguerre’s equation.
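To see why the integrand is a perfect differential, it may help to carry out the differentiation explicitly. The lines below are only a worked check, using the exponent and powers that already appear in Eq. 13.25a:

```latex
\frac{d}{dz}\left(\frac{-xz}{1-z}\right) = -\frac{x}{(1-z)^{2}},
\qquad
\frac{d}{dz}\left[\frac{e^{-xz/(1-z)}}{(1-z)\,z^{n}}\right]
 = e^{-xz/(1-z)}\left[-\frac{x}{(1-z)^{3}z^{n}}
   + \frac{1}{(1-z)^{2}z^{n}} - \frac{n}{(1-z)\,z^{n+1}}\right].
```

Since $x z/[(1-z)^3 z^{n}] + x/[(1-z)^2 z^{n}] = x/[(1-z)^3 z^{n}]$, the bracket obtained on substituting into Eq. 13.24 is the negative of this derivative, which accounts for the overall minus sign in front of the integral of the perfect differential.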

It has become customary to define $L_n(x)$, the Laguerre polynomial (Fig. 13.4),
by¹

$$L_n(x) = \frac{1}{2\pi i}\oint \frac{e^{-xz/(1-z)}}{(1-z)\,z^{n+1}}\,dz. \tag{13.27}$$


¹ Other definitions of $L_n(x)$ are in use. The definitions here of the Laguerre
polynomial $L_n(x)$ and the associated Laguerre polynomial $L_n^k(x)$ agree with
AMS-55 (Chapter 22).
This is exactly what we would obtain from the series

$$g(x,z) = \frac{e^{-xz/(1-z)}}{1-z} = \sum_{n=0}^{\infty} L_n(x)\,z^n, \qquad |z| < 1 \tag{13.28}$$

if we multiplied by $z^{-n-1}$ and integrated around the origin. As in the development
of the calculus of residues (Section 7.2), only the $z^{-1}$ term in the series survives.
On this basis we identify $g(x,z)$ as the generating function for the Laguerre
polynomials.
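The contour integral of Eq. 13.27 can also be evaluated numerically, which gives a quick check on the generating function. The sketch below is an illustration only: the function name `laguerre_contour`, the choice of a circular contour of radius 0.5, and the number of sample points are assumptions, not part of the text. On the circle $z = re^{i\theta}$ we have $dz = iz\,d\theta$, so the integral reduces to the average of $g(x,z)/z^n$ over the circle.

```python
import cmath
import math


def laguerre_contour(n, x, radius=0.5, n_points=64):
    """Approximate L_n(x) from the contour integral of Eq. 13.27.

    The contour is the circle |z| = radius < 1, which encloses the origin
    but not z = 1.  With z = radius*exp(i*theta), dz = i*z*dtheta, so the
    integral becomes the mean of g(x, z) / z**n over equally spaced points.
    """
    total = 0.0 + 0.0j
    for j in range(n_points):
        z = radius * cmath.exp(2j * math.pi * j / n_points)
        g = cmath.exp(-x * z / (1.0 - z)) / (1.0 - z)  # generating function, Eq. 13.28
        total += g / z**n
    return (total / n_points).real


if __name__ == "__main__":
    x = 1.5
    print(laguerre_contour(1, x), 1.0 - x)                      # L_1(x) = 1 - x
    print(laguerre_contour(2, x), (x * x - 4.0 * x + 2.0) / 2)  # L_2(x) = (x^2 - 4x + 2)/2
```

The equally spaced sum is the trapezoidal rule on a periodic integrand, which converges very rapidly here because the integrand is analytic in an annulus around the contour.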

With the transformation

$$\frac{xz}{1-z} = s - x \qquad \text{or} \qquad z = \frac{s-x}{s}, \tag{13.29}$$

$$L_n(x) = \frac{e^{x}}{2\pi i}\oint \frac{s^{n} e^{-s}}{(s-x)^{n+1}}\,ds, \tag{13.30}$$

the new contour enclosing the point $s = x$ in the $s$-plane. By Cauchy’s integral
formula (for derivatives)

$$L_n(x) = \frac{e^{x}}{n!}\frac{d^{n}}{dx^{n}}\left(x^{n} e^{-x}\right) \qquad (\text{integral } n), \tag{13.31}$$

giving Rodrigues’ formula for Laguerre polynomials. From these representations
of $L_n(x)$ we find the series form (for integral $n$):

$$L_n(x) = \frac{(-1)^n}{n!}\left[x^n - \frac{n^2}{1!}x^{n-1} + \frac{n^2(n-1)^2}{2!}x^{n-2} - \cdots + (-1)^n n!\right]
 = \sum_{m=0}^{n} \frac{(-1)^m\, n!}{(n-m)!\,m!\,m!}\,x^m
 = \sum_{s=0}^{n} \frac{(-1)^{n-s}\, n!}{(n-s)!\,(n-s)!\,s!}\,x^{n-s} \tag{13.32}$$

and the specific polynomials listed in Table 13.2 (Exercise 13.2.1).

By differentiating the generating function in Eq. 13.28, with respect to $x$ and
$z$, we obtain the recurrence relations
TABLE 13.2 Laguerre Polynomials


$L_0(x) = 1$
$L_1(x) = -x + 1$
$2!\,L_2(x) = x^2 - 4x + 2$
$3!\,L_3(x) = -x^3 + 9x^2 - 18x + 6$
$4!\,L_4(x) = x^4 - 16x^3 + 72x^2 - 96x + 24$
$5!\,L_5(x) = -x^5 + 25x^4 - 200x^3 + 600x^2 - 600x + 120$
$6!\,L_6(x) = x^6 - 36x^5 + 450x^4 - 2400x^3 + 5400x^2 - 4320x + 720$
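The rows of Table 13.2 follow directly from the series form, Eq. 13.32. The sketch below is one possible way to generate them with exact rational arithmetic; the helper names `laguerre_coeffs` and `scaled_row` are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from math import factorial


def laguerre_coeffs(n):
    """Exact coefficients c[m] of x**m in L_n(x), from the series form Eq. 13.32."""
    return [Fraction((-1) ** m * factorial(n),
                     factorial(n - m) * factorial(m) ** 2)
            for m in range(n + 1)]


def scaled_row(n):
    """Integer coefficients of n! * L_n(x), highest power first (layout of Table 13.2)."""
    return [int(factorial(n) * c) for c in reversed(laguerre_coeffs(n))]


if __name__ == "__main__":
    for n in range(7):
        print(n, scaled_row(n))
```

Using `Fraction` avoids any rounding in the factorial ratios, so the printed integers can be compared digit for digit with the table.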


$$(n+1)\,L_{n+1}(x) = (2n+1-x)\,L_n(x) - n\,L_{n-1}(x), \tag{13.33}$$

$$x\,L_n'(x) = n\,L_n(x) - n\,L_{n-1}(x). \tag{13.34}$$

Equation 13.33, modified to read

$$L_{n+1}(x) = 2L_n(x) - L_{n-1}(x) - \left[(1+x)\,L_n(x) - L_{n-1}(x)\right]/(n+1), \tag{13.33a}$$

for reasons of economy and numerical stability, is used for machine computation
of numerical values of $L_n(x)$. The computing machine starts with known
numerical values of $L_0(x)$ and $L_1(x)$, Table 13.2, and works up step by step, in
milliseconds. This is the same technique discussed for computing Legendre
polynomials, Section 12.2.

Also, from Eq. 13.28 we find the special value

$$L_n(0) = 1. \tag{13.35}$$

As may be seen from the form of the generating function, the form of Laguerre’s
differential equation, or from Table 13.2, the Laguerre polynomials have neither
odd nor even symmetry (parity).

The Laguerre differential equation is not self-adjoint and the Laguerre
polynomials, $L_n(x)$, do not by themselves form an orthogonal set. However,
following the method of Section 9.1, we may multiply Eq. 13.24 by $e^{-x}$ (Exercise 9.1.1)
and obtain

$$\int_0^{\infty} e^{-x} L_m(x)\, L_n(x)\,dx = \delta_{m,n}. \tag{13.36}$$
This orthogonality is a consequence of the Sturm-Liouville theory, Section 9.1. 
The normalization follows from the generating function. It is sometimes con- 
venient to define orthogonalized Laguerre functions (with unit weighting func- 
tion) by 

$$\varphi_n(x) = e^{-x/2} L_n(x). \tag{13.37}$$

Our new orthonormal function $\varphi_n(x)$ satisfies the differential equation

$$x\,\varphi_n''(x) + \varphi_n'(x) + \left(n + \frac{1}{2} - \frac{x}{4}\right)\varphi_n(x) = 0, \tag{13.38}$$

which is seen to have the Sturm-Liouville form (self-adjoint). Note that it is
the boundary conditions in the Sturm-Liouville theory that fix our interval as
$(0 \le x < \infty)$.
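The step-by-step recurrence of Eq. 13.33a and the orthonormality integral of Eq. 13.36 can both be checked in a few lines. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the exact check relies on the elementary integral $\int_0^\infty x^p e^{-x}\,dx = p!$, which is used here as a convenience rather than taken from the text.

```python
from fractions import Fraction
from math import factorial


def laguerre_value(n, x):
    """L_n(x) by the upward recurrence of Eq. 13.33a, from L_0 = 1 and L_1 = 1 - x."""
    prev, curr = 1.0, 1.0 - x
    if n == 0:
        return prev
    for k in range(1, n):
        # k plays the role of n in Eq. 13.33a and produces L_{k+1}
        prev, curr = curr, 2.0 * curr - prev - ((1.0 + x) * curr - prev) / (k + 1)
    return curr


def laguerre_poly(n):
    """Exact coefficients of L_n (lowest power first), built with Eq. 13.33."""
    prev, curr = [Fraction(1)], [Fraction(1), Fraction(-1)]
    if n == 0:
        return prev
    for k in range(1, n):
        nxt = [Fraction(0)] * (k + 2)
        for i, c in enumerate(curr):      # (2k + 1 - x) * L_k
            nxt[i] += (2 * k + 1) * c
            nxt[i + 1] -= c
        for i, c in enumerate(prev):      # - k * L_{k-1}
            nxt[i] -= k * c
        prev, curr = curr, [c / (k + 1) for c in nxt]
    return curr


def weighted_overlap(m, n):
    """Integral over [0, inf) of exp(-x) L_m(x) L_n(x), term by term with p! moments."""
    a, b = laguerre_poly(m), laguerre_poly(n)
    return sum(ca * cb * factorial(i + j)
               for i, ca in enumerate(a) for j, cb in enumerate(b))


if __name__ == "__main__":
    print(laguerre_value(2, 1.5))                         # compare with (x^2 - 4x + 2)/2
    print(weighted_overlap(3, 3), weighted_overlap(2, 4))  # compare with Eq. 13.36
```

Because the overlap is computed in exact rational arithmetic, the diagonal and off-diagonal values can be compared with the Kronecker delta of Eq. 13.36 without any tolerance.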

Associated Laguerre Polynomials 

In many applications, particularly in quantum theory, we need the associated
Laguerre polynomials defined by²

$$L_n^k(x) = (-1)^k \frac{d^k}{dx^k}\left[L_{n+k}(x)\right]. \tag{13.39}$$

From the series form of $L_n(x)$

$$L_0^k(x) = 1, \qquad L_1^k(x) = -x + k + 1, \qquad L_2^k(x) = \frac{x^2}{2} - (k+2)\,x + \frac{(k+2)(k+1)}{2}. \tag{13.40}$$

In general,

$$L_n^k(x) = \sum_{m=0}^{n} (-1)^m \frac{(n+k)!}{(n-m)!\,(k+m)!\,m!}\,x^m, \qquad k > -1. \tag{13.41}$$

A generating function may be developed by differentiating the Laguerre
generating function $k$ times. Adjusting the index to $L_{n+k}$, we obtain

$$\frac{e^{-xz/(1-z)}}{(1-z)^{k+1}} = \sum_{n=0}^{\infty} L_n^k(x)\,z^n, \qquad |z| < 1. \tag{13.42}$$

From this

$$L_n^k(0) = \frac{(n+k)!}{n!\,k!}. \tag{13.43}$$
Recurrence relations can easily be derived from the generating function or by 
differentiating the Laguerre polynomial recurrence relations. Among the nu- 
merous possibilities are 

$$(n+1)\,L_{n+1}^k(x) = (2n + k + 1 - x)\,L_n^k(x) - (n+k)\,L_{n-1}^k(x), \tag{13.44}$$

$$x\,L_n^{k\,\prime}(x) = n\,L_n^k(x) - (n+k)\,L_{n-1}^k(x). \tag{13.45}$$

From these or from differentiating Laguerre’s differential equation $k$ times we
have the associated Laguerre equation

$$x\,L_n^{k\,\prime\prime}(x) + (k + 1 - x)\,L_n^{k\,\prime}(x) + n\,L_n^k(x) = 0. \tag{13.46}$$

When associated Laguerre polynomials appear in a physical problem it is 
usually because that physical problem involves Eq. 13.46. 
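As with the ordinary Laguerre polynomials, the three-term recurrence (Eq. 13.44) gives a convenient way to compute numerical values of $L_n^k(x)$. The sketch below compares it with the finite series of Eq. 13.41 for integer $k \ge 0$; the function names are hypothetical and the sample arguments are arbitrary.

```python
from math import factorial


def assoc_laguerre_series(n, k, x):
    """L_n^k(x) from the finite series, Eq. 13.41 (integer k >= 0 assumed here)."""
    return sum((-1) ** m * factorial(n + k)
               / (factorial(n - m) * factorial(k + m) * factorial(m)) * x ** m
               for m in range(n + 1))


def assoc_laguerre_recurrence(n, k, x):
    """L_n^k(x) by the upward recurrence, Eq. 13.44, from L_0^k = 1, L_1^k = -x + k + 1."""
    prev, curr = 1.0, k + 1.0 - x
    if n == 0:
        return prev
    for j in range(1, n):
        prev, curr = curr, ((2 * j + k + 1 - x) * curr - (j + k) * prev) / (j + 1)
    return curr


if __name__ == "__main__":
    n, k, x = 5, 3, 2.25
    print(assoc_laguerre_series(n, k, x), assoc_laguerre_recurrence(n, k, x))
    # Special value at the origin, Eq. 13.43: (n + k)! / (n! k!)
    print(assoc_laguerre_recurrence(4, 2, 0.0), factorial(6) / (factorial(4) * factorial(2)))
```

For large $n$ the alternating series loses accuracy through cancellation, which is one reason a recurrence is usually preferred for numerical work.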


² Some authors use $\mathcal{L}_{n+k}^{k}(x) = (d^k/dx^k)[L_{n+k}(x)]$. Hence our

$L_n^k(x) = (-1)^k \mathcal{L}_{n+k}^{k}(x)$.
A Rodrigues representation of the associated Laguerre polynomial is

$$L_n^k(x) = \frac{e^{x} x^{-k}}{n!}\frac{d^{n}}{dx^{n}}\left(e^{-x} x^{n+k}\right). \tag{13.47}$$

The reader will note that all these formulas for $L_n^k(x)$ reduce to the corresponding
expressions for $L_n(x)$ when $k = 0$.

The associated Laguerre equation (13.46) is not self-adjoint but it can be put
in self-adjoint form by multiplying by $e^{-x}x^{k}$, which becomes the weighting
function (Section 9.1). We obtain

$$\int_0^{\infty} e^{-x} x^{k} L_n^k(x)\, L_m^k(x)\,dx = \frac{(n+k)!}{n!}\,\delta_{m,n}. \tag{13.48}$$

Equation 13.48 shows the same orthogonality interval $(0, \infty)$ as that for the
Laguerre polynomials, but with a new weighting function we have a new set of
orthogonal polynomials, the associated Laguerre polynomials.

By letting $\psi_n^k(x) = e^{-x/2} x^{k/2} L_n^k(x)$, $\psi_n^k(x)$ satisfies the self-adjoint equation

$$x\,\psi_n^{k\,\prime\prime}(x) + \psi_n^{k\,\prime}(x) + \left(-\frac{x}{4} + \frac{2n + k + 1}{2} - \frac{k^2}{4x}\right)\psi_n^k(x) = 0. \tag{13.49}$$

The $\psi_n^k(x)$ are sometimes called Laguerre functions. Equation 13.36 is the special
case $k = 0$.

A further useful form is given by defining³

$$\Phi_n^k(x) = e^{-x/2} x^{(k+1)/2} L_n^k(x). \tag{13.50}$$

Substitution into the associated Laguerre equation yields

$$\Phi_n^{k\,\prime\prime}(x) + \left(-\frac{1}{4} + \frac{2n + k + 1}{2x} - \frac{k^2 - 1}{4x^2}\right)\Phi_n^k(x) = 0. \tag{13.51}$$

The corresponding normalization integral is

$$\int_0^{\infty} e^{-x} x^{k+1} L_n^k(x)\, L_n^k(x)\,dx = \frac{(n+k)!}{n!}\,(2n + k + 1). \tag{13.52}$$

The reader may show that the $\Phi_n^k(x)$ do not form an orthogonal set (except with
$x^{-1}$ as a weighting function) because of the $x^{-1}$ in the term $(2n + k + 1)/2x$.

The Laguerre functions $L_\nu^\mu(x)$ in which the indices $\nu$ and $\mu$ are not integers may
be defined using the confluent hypergeometric functions of Section 13.6.
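The orthogonality relation (Eq. 13.48) and the normalization integral with the extra power of $x$ (Eq. 13.52) can be verified exactly for small indices. This is a sketch only: the names `assoc_laguerre_coeffs` and `moment_integral` are hypothetical, $k$ is restricted to nonnegative integers, and the check uses $\int_0^\infty x^p e^{-x}\,dx = p!$ term by term.

```python
from fractions import Fraction
from math import factorial


def assoc_laguerre_coeffs(n, k):
    """Exact coefficients (lowest power first) of L_n^k(x), from Eq. 13.41."""
    return [Fraction((-1) ** m * factorial(n + k),
                     factorial(n - m) * factorial(k + m) * factorial(m))
            for m in range(n + 1)]


def moment_integral(n, m, k, extra_power=0):
    """Integral over [0, inf) of exp(-x) x**(k + extra_power) L_n^k(x) L_m^k(x)."""
    a, b = assoc_laguerre_coeffs(n, k), assoc_laguerre_coeffs(m, k)
    return sum(ca * cb * factorial(i + j + k + extra_power)
               for i, ca in enumerate(a) for j, cb in enumerate(b))


if __name__ == "__main__":
    n, k = 3, 2
    print(moment_integral(n, n, k), Fraction(factorial(n + k), factorial(n)))   # Eq. 13.48
    print(moment_integral(n, n - 1, k))                                         # off-diagonal
    print(moment_integral(n, n, k, extra_power=1),
          Fraction(factorial(n + k), factorial(n)) * (2 * n + k + 1))           # Eq. 13.52
    print(moment_integral(1, 0, 0, extra_power=1))   # nonzero: weight x**(k+1) spoils orthogonality
```

The last line illustrates the remark above: with the weight $e^{-x}x^{k+1}$ the integral for $m \ne n$ need not vanish.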


EXAMPLE 13.2.1 The Hydrogen Atom 


Perhaps the most important single application of the Laguerre polynomials 
is in the solution of the Schrodinger wave equation for the hydrogen atom. This 
equation is 


³ This corresponds to modifying the function $\psi$ in Eq. 13.49 to eliminate the
first derivative (compare Exercise 8.6.11).
$$-\frac{\hbar^2}{2m}\nabla^2\psi - \frac{Ze^2}{r}\psi = E\psi, \tag{13.53}$$

in which $Z = 1$ for hydrogen, 2 for singly ionized helium, and so on. Separating
variables, we find that the angular dependence of $\psi$ is $Y_L^M(\theta, \varphi)$. The radial part,
$R(r)$, satisfies the equation


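$$-\frac{\hbar^2}{2m}\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right) - \frac{Ze^2}{r}R + \frac{\hbar^2}{2m}\frac{L(L+1)}{r^2}R = ER. \tag{13.54}$$

By use of the abbreviations

$$\rho = \alpha r \quad \text{with} \quad \alpha^2 = -\frac{8mE}{\hbar^2}, \quad E < 0, \qquad \lambda = \frac{2mZe^2}{\alpha\hbar^2}, \tag{13.55}$$

Eq. 13.54 becomes

$$\frac{1}{\rho^2}\frac{d}{d\rho}\left(\rho^2\frac{d\chi(\rho)}{d\rho}\right) + \left(\frac{\lambda}{\rho} - \frac{1}{4} - \frac{L(L+1)}{\rho^2}\right)\chi(\rho) = 0, \tag{13.56}$$

where $\chi(\rho) = R(\rho/\alpha)$. A comparison with Eq. 13.51 for $\Phi_n^k(x)$ shows that Eq. 13.56
is satisfied by

$$\rho\,\chi(\rho) = e^{-\rho/2}\rho^{L+1} L_{\lambda-L-1}^{2L+1}(\rho), \tag{13.57}$$

in which $k$ is replaced by $2L + 1$ and $n$ by $\lambda - L - 1$.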

We must restrict the parameter $\lambda$ by requiring it to be an integer $n$, $n =
1, 2, 3, \ldots$.⁴ This is necessary because the Laguerre function of nonintegral $n$
would diverge as $\rho^n e^{\rho}$, which is unacceptable for our physical problem in which

$$\lim_{r\to\infty} R(r) = 0.$$

⁴ This is the conventional notation for $\lambda$. It is not the same $n$ as the index $n$
in $\Phi_n^k(x)$.

This restriction on $\lambda$, imposed by our boundary condition, has the effect of
quantizing the energy

$$E_n = -\frac{Z^2 m e^4}{2n^2\hbar^2}. \tag{13.58}$$

The negative sign enters because we are dealing here with bound states, $E = 0$
corresponding to an electron that is just able to escape to infinity. Using this
result for $E_n$, we have

$$\alpha = \frac{2me^2}{\hbar^2}\frac{Z}{n} = \frac{2Z}{na_0}, \qquad \rho = \frac{2Z}{na_0}\,r, \tag{13.59}$$

with

$$a_0 = \frac{\hbar^2}{me^2}, \quad \text{the Bohr radius.}$$

The final normalized hydrogen wave function may be written as

$$\psi_{nLM}(r, \theta, \varphi) = \left[\left(\frac{2Z}{na_0}\right)^3 \frac{(n-L-1)!}{2n\,(n+L)!}\right]^{1/2} e^{-\alpha r/2}(\alpha r)^{L} L_{n-L-1}^{2L+1}(\alpha r)\,Y_L^M(\theta, \varphi). \tag{13.60}$$
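The normalization constant in Eq. 13.60 can be checked against Eq. 13.52. Changing variables to $\rho = \alpha r$ removes $\alpha$ from the radial integral, leaving $\int_0^\infty e^{-\rho}\rho^{2L+2}[L_{n-L-1}^{2L+1}(\rho)]^2\,d\rho$ times the factor $(n-L-1)!/[2n(n+L)!]$. The sketch below evaluates this exactly; the function names are hypothetical, and the integral is done term by term with $\int_0^\infty \rho^p e^{-\rho}\,d\rho = p!$.

```python
from fractions import Fraction
from math import factorial


def assoc_laguerre_coeffs(p, k):
    """Exact coefficients (lowest power first) of L_p^k, from Eq. 13.41."""
    return [Fraction((-1) ** m * factorial(p + k),
                     factorial(p - m) * factorial(k + m) * factorial(m))
            for m in range(p + 1)]


def radial_norm(n, L):
    """Integral of R_nL(r)**2 r**2 dr for the radial factor of Eq. 13.60, done in rho = alpha*r."""
    p, k = n - L - 1, 2 * L + 1
    c = assoc_laguerre_coeffs(p, k)
    # integral over rho of exp(-rho) * rho**(2L + 2) * [L_p^k(rho)]**2
    integral = sum(ci * cj * factorial(i + j + 2 * L + 2)
                   for i, ci in enumerate(c) for j, cj in enumerate(c))
    return Fraction(factorial(n - L - 1), 2 * n * factorial(n + L)) * integral


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, [radial_norm(n, L) for L in range(n)])
```

Each entry printed should be compared with unity; the factor $2n$ in the normalization constant is exactly the $(2n' + k + 1)$ of Eq. 13.52 with $n' = n - L - 1$ and $k = 2L + 1$.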


EXERCISES 


13.2.1 Show with the aid of the Leibnitz formula, that the series expansion of $L_n(x)$
(Eq. 13.32) follows from the Rodrigues representation (Eq. 13.31).

13.2.2 (a) Using the explicit series form (Eq. 13.32) show that

$$L_n'(0) = -n, \qquad L_n''(0) = \tfrac{1}{2}\,n(n-1).$$

(b) Repeat without using the explicit series form of $L_n(x)$.


13.2.3 From the generating function derive the Rodrigues representation

$$L_n^k(x) = \frac{e^{x} x^{-k}}{n!}\frac{d^{n}}{dx^{n}}\left(e^{-x} x^{n+k}\right).$$

13.2.4 Derive the normalization relation (Eq. 13.48) for the associated Laguerre
polynomials.


13.2.5 Expand $x^r$ in a series of associated Laguerre polynomials $L_n^k(x)$, $k$ fixed and $n$
ranging from 0 to $r$ (or to $\infty$ if $r$ is not an integer).

Hint. The Rodrigues form of $L_n^k(x)$ will be useful.

ANS. $$x^r = (r+k)!\,r!\sum_{n=0}^{r}\frac{(-1)^n L_n^k(x)}{(n+k)!\,(r-n)!}, \qquad 0 \le x < \infty.$$


13.2.6 Expand $e^{-ax}$ in a series of associated Laguerre polynomials $L_n^k(x)$, $k$ fixed and $n$
ranging from 0 to $\infty$.

(a) Evaluate directly the coefficients in your assumed expansion.

(b) Develop the desired expansion from the generating function.

ANS. $$e^{-ax} = \frac{1}{(1+a)^{1+k}}\sum_{n=0}^{\infty}\left(\frac{a}{1+a}\right)^{n} L_n^k(x), \qquad 0 \le x < \infty.$$


13.2.7 Show that

$$\int_0^{\infty} e^{-x} x^{k+1} L_n^k(x)\, L_n^k(x)\,dx = \frac{(n+k)!}{n!}\,(2n + k + 1).$$

Hint. Note that

$$x L_n^k = (2n + k + 1)\,L_n^k - (n+k)\,L_{n-1}^k - (n+1)\,L_{n+1}^k.$$

13.2.8 Assume that a particular problem in quantum mechanics has led to the
differential equation

$$\frac{d^2y}{dx^2} - \left[\frac{k^2 - 1}{4x^2} - \frac{2n + k + 1}{2x} + \frac{1}{4}\right] y = 0.$$

Write $y(x)$ as

$$y(x) = A(x)\,B(x)\,C(x)$$

with the requirement that

(a) $A(x)$ be a negative exponential giving the required asymptotic behavior of
$y(x)$ and

(b) $B(x)$ be a positive power of $x$ giving the behavior of $y(x)$ for $0 \le x \ll 1$.
Determine $A(x)$ and $B(x)$. Find the relation between $C(x)$ and the associated
Laguerre polynomial.

ANS. $A(x) = e^{-x/2}$,
$B(x) = x^{(k+1)/2}$,
$C(x) = L_n^k(x)$.


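13.2.9 From Eq. 13.60 the normalized radial part of the hydrogenic wave function is

$$R_{nL}(r) = \left[\alpha^3\frac{(n-L-1)!}{2n\,(n+L)!}\right]^{1/2} e^{-\alpha r/2}(\alpha r)^{L} L_{n-L-1}^{2L+1}(\alpha r),$$

in which $\alpha = 2Z/na_0 = 2Zme^2/n\hbar^2$. Evaluate

(a) $$\langle r\rangle = \int_0^{\infty} r\,R_{nL}(\alpha r)\,R_{nL}(\alpha r)\,r^2\,dr,$$

(b) $$\langle r^{-1}\rangle = \int_0^{\infty} r^{-1} R_{nL}(\alpha r)\,R_{nL}(\alpha r)\,r^2\,dr.$$

The quantity $\langle r\rangle$ is the average displacement of the electron from the nucleus,
whereas $\langle r^{-1}\rangle$ is the average of the reciprocal displacement.

ANS. $$\langle r\rangle = \frac{a_0}{2}\left[3n^2 - L(L+1)\right], \qquad \langle r^{-1}\rangle = \frac{1}{n^2 a_0}.$$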


13.2.10 Derive the recurrence relation for the hydrogen wave function expectation
values.

$$\frac{s+2}{n^2}\langle r^{s+1}\rangle - (2s+3)\,a_0\langle r^{s}\rangle + \frac{s+1}{4}\left[(2L+1)^2 - (s+1)^2\right]a_0^2\langle r^{s-1}\rangle = 0$$

with $s \ge -2L - 1$. $\langle r^s\rangle \equiv \overline{r^s}$.

Hint. Transform Eq. 13.56 into a form analogous to Eq. 13.51. Multiply by
$\rho^{s+2}u' - c\rho^{s+1}u$. Here $u = \rho\Phi$. Adjust $c$ to cancel terms that do not yield
expectation values.


13.2.11 The hydrogen wave functions, Eq. 13.60, are mutually orthogonal as they should
be, since they are eigenfunctions of the self-adjoint Schrodinger equation.

$$\int \psi^{*}_{n_1L_1M_1}\,\psi_{n_2L_2M_2}\,r^2\,dr\,d\Omega = \delta_{n_1n_2}\,\delta_{L_1L_2}\,\delta_{M_1M_2}.$$

Yet the radial integral has the (misleading) form

$$\int_0^{\infty} e^{-\alpha r/2}(\alpha r)^{L} L_{n_1-L-1}^{2L+1}(\alpha r)\; e^{-\alpha r/2}(\alpha r)^{L} L_{n_2-L-1}^{2L+1}(\alpha r)\, r^2\,dr,$$

which appears to match Eq. 13.52 and not the associated Laguerre orthogonality
relation, Eq. 13.48. How do you resolve this paradox?

ANS. The parameter $\alpha$ is dependent on $n$. The first three $\alpha$’s previously
shown are $2Z/n_1a_0$. The last three are $2Z/n_2a_0$. For $n_1 = n_2$
Eq. 13.52 applies. For $n_1 \ne n_2$ neither Eq. 13.48 nor Eq. 13.52
is applicable.
13.2.12 A quantum mechanical analysis of the Stark effect (parabolic coordinate) leads
to the differential equation

Here $F$ is a measure of the perturbation energy introduced by an external electric
field. Find the unperturbed wave functions ($F = 0$) in terms of associated
Laguerre polynomials.

ANS. $u(\xi) = e^{-\varepsilon\xi/2}\,\xi^{m/2} L_p^m(\varepsilon\xi)$, with $\varepsilon = \sqrt{-2E} > 0$, $p = \alpha/\varepsilon - (m+1)/2$,
a nonnegative integer.


13.2.13 The wave equation for the three-dimensional harmonic oscillator is

$$-\frac{\hbar^2}{2M}\nabla^2\psi + \frac{1}{2}M\omega^2 r^2\psi = E\psi.$$

Here $\omega$ is the angular frequency of the corresponding classical oscillator. Show
that the radial part of $\psi$ (in spherical polar coordinates) may be written in terms
of associated Laguerre functions of argument ($\beta r^2$), where $\beta = M\omega/\hbar$.

Hint. As in Exercise 13.2.8, split off radial factors of $r^{l}$ and $e^{-\beta r^2/2}$. The associated
Laguerre function will have the form $L_{(n-l-1)/2}^{l+1/2}(\beta r^2)$.


13.2.14 Write a program that will generate the coefficients $a_s$ in the polynomial form
of the Laguerre polynomial, $L_n(x) = \sum_{s=0}^{n} a_s x^s$.

13.2.15 (a) Write a subroutine that will transform a finite power series $\sum_{n=0}^{N} a_n x^n$ into
a Laguerre series $\sum_{n=0}^{N} b_n L_n(x)$. Use the recurrence relation Eq. 13.33 and
follow the technique outlined in Section 13.4 for a Chebyshev series.


13.2.16 Tabulate $L_{10}(x)$ for $x = 0.0(0.1)30.0$. This will include the 10 roots of $L_{10}$.
Beyond $x = 30.0$, $L_{10}(x)$ is monotonic increasing. If a plotting subroutine is
available, plot your results.

Check value. Eighth root = 16.279. 

13.2.17 Determine the 10 roots of L 10 (x) using a root-finding subroutine (compare 
Appendix 1). You may use your knowledge of the approximate location of the 
roots or develop a search routine to look for the roots. The 10 roots of L 10 (x) 
are the evaluation points for the 10-point Gauss- Laguerre quadrature (compare 
Appendix 2). Check your values by comparing with AMS-55 (Table 25.9). 

13.2.18 Calculate the coefficients of a Laguerre series expansion ($L_n(x)$, $k = 0$) of the
exponential $e^{-x}$. Evaluate the coefficients by the Gauss-Laguerre quadrature
(compare Eq. 9.^4). Check your results against the values given in Exercise 
13.2.6. 

Note. Direct application of the Gauss-Laguerre quadrature with $f(x) =
L_n(x)e^{-x}$ gives poor accuracy because of the extra $e^{-x}$. Try a change of variable,
$y = 2x$, so that the function appearing in the integrand will be simply $L_n(y/2)$.

13.2.19 (a) Write a subroutine to calculate the Laguerre matrix elements

$$M_{mnp} = \int_0^{\infty} L_m(x)\,L_n(x)\,x^{p} e^{-x}\,dx.$$

Include a check of the condition $|m - n| \le p \le m + n$. (If $p$ is outside
this range, $M_{mnp} = 0$: Why?)

Note. A 10-point Gauss-Laguerre quadrature will give accurate results
for $m + n + p \le 19$.

(b) Call your subroutine to calculate a variety of Laguerre matrix elements.

Check $M_{nn1}$ against Exercise 13.2.7.

13.2.20 Write a subroutine to calculate the numerical value of $L_n^k(x)$ for specified values
of $n$, $k$, and $x$. Require that $n$ and $k$ be nonnegative integers and $x$ be $\ge 0$.

Hint. Starting with known values of $L_0^k$ and $L_1^k(x)$, we may use the recurrence
relation, Eq. 13.44, to generate $L_n^k(x)$, $n = 2, 3, 4, \ldots$.

13.2.21 Write a program to calculate the normalized hydrogen radial wave function
$\psi_{nL}(r)$. This is $\psi_{nLM}$ of Eq. 13.60, omitting the spherical harmonic $Y_L^M(\theta, \varphi)$.
Take $Z = 1$ and $a_0 = 1$ (which means that $r$ is being expressed in units of Bohr
radii). Accept $n$ and $L$ as input data. Tabulate $\psi_{nL}(r)$ for $r = 0.0(0.2)R$ with $R$
taken large enough to exhibit the significant features of $\psi$. This means roughly
$R = 5$ for $n = 1$, 10 for $n = 2$, and 30 for $n = 3$.


13.3 CHEBYSHEV (TSCHEBYSCHEFF) POLYNOMIALS

In this section two types of Chebyshev polynomials are developed as special
cases of ultraspherical polynomials. Their properties follow from the
ultraspherical polynomial generating function. The primary importance of the
Chebyshev polynomials is in numerical analysis. Section 13.4 is devoted to these
numerical applications.

Generating Functions 

In Section 12.1 the generating function for the ultraspherical or Gegenbauer
polynomials

$$\frac{1}{(1 - 2xt + t^2)^{\alpha}} = \sum_{n=0}^{\infty} C_n^{(\alpha)}(x)\,t^n, \qquad |x| < 1, \quad |t| < 1 \tag{13.61}$$

was mentioned, with $\alpha = \tfrac{1}{2}$ giving rise to the Legendre polynomials. In this
section we first take $\alpha = 1$ and then $\alpha = 0$ to generate two sets of polynomials
known as the Chebyshev polynomials.

Type II 

With $\alpha = 1$ and $C_n^{(1)}(x) = U_n(x)$, Eq. 13.61 gives

$$\frac{1}{1 - 2xt + t^2} = \sum_{n=0}^{\infty} U_n(x)\,t^n, \qquad |x| < 1, \quad |t| < 1. \tag{13.62}$$

These functions, $U_n(x)$, generated by $(1 - 2xt + t^2)^{-1}$ are labeled Chebyshev
polynomials type II. Although these polynomials have few applications in
mathematical physics, one unusual application is in the development of
four-dimensional spherical harmonics used in angular momentum theory.

Type I 

With $\alpha = 0$ there is difficulty. Indeed, our generating function reduces to the
constant 1. We may avoid this problem by first differentiating Eq. 13.61 with
respect to $t$. This yields

$$\frac{-\alpha(-2x + 2t)}{(1 - 2xt + t^2)^{\alpha+1}} = \sum_{n=1}^{\infty} n\,C_n^{(\alpha)}(x)\,t^{n-1} \tag{13.63}$$

or

$$\frac{x - t}{(1 - 2xt + t^2)^{\alpha+1}} = \sum_{n=1}^{\infty} \frac{n}{2}\left[\frac{C_n^{(\alpha)}(x)}{\alpha}\right] t^{n-1}. \tag{13.64}$$

We define $C_n^{(0)}(x)$ by

$$C_n^{(0)}(x) = \lim_{\alpha\to 0}\frac{C_n^{(\alpha)}(x)}{\alpha}. \tag{13.65}$$

The purpose of differentiating with respect to $t$ was to get an $\alpha$ in the
denominator and to create an indeterminate form. Now multiplying Eq. 13.64
by $2t$ and adding $1 = (1 - 2xt + t^2)/(1 - 2xt + t^2)$, we obtain

$$\frac{1 - t^2}{1 - 2xt + t^2} = 1 + 2\sum_{n=1}^{\infty}\frac{n}{2}\,C_n^{(0)}(x)\,t^n. \tag{13.66}$$

We define $T_n(x)$ by

$$T_n(x) = \begin{cases} 1, & n = 0 \\ \dfrac{n}{2}\,C_n^{(0)}(x), & n > 0. \end{cases} \tag{13.67}$$

Notice the special treatment for $n = 0$. This is similar to the treatment of the $n = 0$
term in the Fourier series. Also, note carefully that $C_n^{(0)}$ is the limit indicated in
Eq. 13.65 and not a literal substitution of $\alpha = 0$ into the generating function
series. With these new labels,

$$\frac{1 - t^2}{1 - 2xt + t^2} = T_0(x) + 2\sum_{n=1}^{\infty} T_n(x)\,t^n, \qquad |x| \le 1, \quad |t| < 1. \tag{13.68}$$

We call $T_n(x)$ the Chebyshev polynomials, type I. The reader should be warned
that the notation for these functions differs from reference to reference. There is
almost no general agreement. Here we follow the usage of AMS-55.

These Chebyshev polynomials (type I), which combine useful features of (1) 
the Fourier series and (2) orthogonal polynomials, are of great interest in 
numerical computation. For example, a least-squares approximation minimizes 
the average squared error. An approximation using Chebyshev polynomials 
allows a larger average squared error but may keep extreme errors down, 
Section 13.4. 
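One way to read the $T_n(x)$ off the generating function, Eq. 13.68, is to expand the left-hand side in powers of $t$ by long division. Writing $(1 - t^2) = (1 - 2xt + t^2)\sum_n c_n t^n$ and matching powers of $t$ gives $c_n = a_n + 2x\,c_{n-1} - c_{n-2}$, where $a_n$ are the coefficients of the numerator. The sketch below does this numerically for a fixed $x$; the function name is an illustrative assumption.

```python
def chebyshev_t_from_generating_function(n_max, x):
    """Return [T_0(x), ..., T_n_max(x)] from the power series in t of Eq. 13.68.

    The series coefficients c_n of (1 - t^2)/(1 - 2xt + t^2) are obtained by
    long division; Eq. 13.68 then gives T_0 = c_0 and T_n = c_n / 2 for n > 0.
    """
    numerator = [1.0, 0.0, -1.0]           # coefficients of 1 - t^2
    c = []
    for n in range(n_max + 1):
        c_n = numerator[n] if n < len(numerator) else 0.0
        if n >= 1:
            c_n += 2.0 * x * c[n - 1]
        if n >= 2:
            c_n -= c[n - 2]
        c.append(c_n)
    return [c[0]] + [0.5 * value for value in c[1:]]


if __name__ == "__main__":
    x = 0.3
    values = chebyshev_t_from_generating_function(4, x)
    print(values[2], 2 * x**2 - 1)            # T_2(x) = 2x^2 - 1
    print(values[3], 4 * x**3 - 3 * x)        # T_3(x) = 4x^3 - 3x
```

The factor of 2 multiplying the sum in Eq. 13.68 is the reason for halving every coefficient except the first.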

Differentiating the generating function (Eqs. 13.62 and 13.68) with respect to
$t$, we obtain recurrence relations

$$T_{n+1}(x) - 2x\,T_n(x) + T_{n-1}(x) = 0, \tag{13.69}$$

$$U_{n+1}(x) - 2x\,U_n(x) + U_{n-1}(x) = 0 \tag{13.70}$$

(see Table 13.3).
TABLE 13.3 Orthogonal Polynomial Recurrence Relation^a

$P_{n+1}(x) = (A_n x + B_n)\,P_n(x) - C_n P_{n-1}(x)$

                        P_n(x)      A_n             B_n               C_n
Legendre                P_n(x)      (2n+1)/(n+1)    0                 n/(n+1)
Chebyshev I             T_n(x)      2               0                 1
Shifted Chebyshev I     T_n*(x)     4               -2                1
Chebyshev II            U_n(x)      2               0                 1
Shifted Chebyshev II    U_n*(x)     4               -2                1
Associated Laguerre     L_n^k(x)    -1/(n+1)        (2n+k+1)/(n+1)    (n+k)/(n+1)
Hermite                 H_n(x)      2               0                 2n

^a $P_n$ is any orthogonal polynomial.


TABLE 13.4 Chebyshev Polynomials, Type I

$T_0 = 1$
$T_1 = x$
$T_2 = 2x^2 - 1$
$T_3 = 4x^3 - 3x$
$T_4 = 8x^4 - 8x^2 + 1$
$T_5 = 16x^5 - 20x^3 + 5x$
$T_6 = 32x^6 - 48x^4 + 18x^2 - 1$

TABLE 13.5 Chebyshev Polynomials, Type II

$U_0 = 1$
$U_1 = 2x$
$U_2 = 4x^2 - 1$
$U_3 = 8x^3 - 4x$
$U_4 = 16x^4 - 12x^2 + 1$
$U_5 = 32x^5 - 32x^3 + 6x$
$U_6 = 64x^6 - 80x^4 + 24x^2 - 1$
Then, using the generating functions for the first few values of $n$ and these
recurrence relations for the higher-order polynomials, we get Tables 13.4 and
13.5 (see also Figs. 13.5 and 13.6).

As with the Hermite polynomials, Section 13.1, the recurrence relations, Eqs.
13.69 and 13.70, together with the known values of $T_0(x)$, $T_1(x)$, $U_0(x)$, and
$U_1(x)$, provide a convenient (that is, for a high-speed electronic computer)
means of getting the numerical value of any $T_n(x_0)$ or $U_n(x_0)$, with $x_0$ a given
number. Again, from the generating functions, we have the special values
$$T_n(1) = 1, \qquad T_n(-1) = (-1)^n, \qquad T_{2n}(0) = (-1)^n, \qquad T_{2n+1}(0) = 0; \tag{13.71}$$

$$U_n(1) = n + 1, \qquad U_n(-1) = (-1)^n(n+1), \qquad U_{2n}(0) = (-1)^n, \qquad U_{2n+1}(0) = 0. \tag{13.72}$$

The parity relations for $T_n$ and $U_n$ are

$$T_n(x) = (-1)^n\,T_n(-x), \tag{13.73}$$

$$U_n(x) = (-1)^n\,U_n(-x). \tag{13.74}$$

Rodrigues representations of $T_n(x)$ and $U_n(x)$ are

$$T_n(x) = \frac{(-1)^n \pi^{1/2}(1 - x^2)^{1/2}}{2^n\,(n - \tfrac{1}{2})!}\frac{d^n}{dx^n}\left[(1 - x^2)^{n-1/2}\right] \tag{13.75}$$

and

$$U_n(x) = \frac{(-1)^n (n+1)\,\pi^{1/2}}{2^{n+1}\,(n + \tfrac{1}{2})!\,(1 - x^2)^{1/2}}\frac{d^n}{dx^n}\left[(1 - x^2)^{n+1/2}\right]. \tag{13.76}$$

Recurrence Relations-Derivatives 

From the generating functions for $T_n(x)$ and $U_n(x)$ differentiation with respect
to $x$ leads to a variety of recurrence relations involving derivatives. Among the
more useful equations are

$$(1 - x^2)\,T_n'(x) = -nx\,T_n(x) + n\,T_{n-1}(x), \tag{13.77}$$

and

$$(1 - x^2)\,U_n'(x) = -nx\,U_n(x) + (n+1)\,U_{n-1}(x). \tag{13.78}$$

From Eqs. 13.69 and 13.77 $T_n(x)$, the Chebyshev polynomial type I, satisfies

$$(1 - x^2)\,T_n''(x) - x\,T_n'(x) + n^2\,T_n(x) = 0. \tag{13.79}$$

$U_n(x)$, the Chebyshev polynomial of type II, satisfies

$$(1 - x^2)\,U_n''(x) - 3x\,U_n'(x) + n(n+2)\,U_n(x) = 0. \tag{13.80}$$

The ultraspherical equation

$$(1 - x^2)\frac{d^2}{dx^2}C_n^{(\alpha)}(x) - (2\alpha + 1)\,x\,\frac{d}{dx}C_n^{(\alpha)}(x) + n(n + 2\alpha)\,C_n^{(\alpha)}(x) = 0 \tag{13.81}$$

is a generalization of these differential equations, reducing to Eq. 13.79 for
$\alpha = 0$ and Eq. 13.80 for $\alpha = 1$ (and to Legendre’s equation for $\alpha = \tfrac{1}{2}$).
Trigonometric Form

At this point in the development of the properties of the Chebyshev solutions
it is beneficial to change variables replacing $x$ by $\cos\theta$. With $x = \cos\theta$ and
$d/dx = (-1/\sin\theta)(d/d\theta)$, Eq. 13.79 becomes

$$\frac{d^2T_n}{d\theta^2} + n^2\,T_n = 0, \tag{13.82}$$

the simple harmonic oscillator equation with solutions $\cos n\theta$ and $\sin n\theta$. The
special values (boundary conditions) identify

$$T_n = \cos n\theta = \cos n(\arccos x). \tag{13.83a}$$

A second linearly independent solution of Eqs. 13.79 and 13.82 is labeled

$$V_n = \sin n\theta = \sin n(\arccos x). \tag{13.83b}$$

The solutions of the type II Chebyshev equation, Eq. 13.80, become

$$U_n = \frac{\sin(n+1)\theta}{\sin\theta}, \tag{13.84a}$$

$$W_n = \frac{\cos(n+1)\theta}{\sin\theta}. \tag{13.84b}$$

The two sets of solutions, type I and type II, are related by

$$V_n(x) = (1 - x^2)^{1/2}\,U_{n-1}(x), \tag{13.85a}$$

$$W_n(x) = (1 - x^2)^{-1/2}\,T_{n+1}(x). \tag{13.85b}$$

As already seen from generating functions, $T_n(x)$ and $U_n(x)$ are polynomials.
Clearly, $V_n(x)$ and $W_n(x)$ are not polynomials.

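From

$$T_n(x) + iV_n(x) = \cos n\theta + i\sin n\theta = (\cos\theta + i\sin\theta)^n = \left[x + i(1 - x^2)^{1/2}\right]^n, \qquad |x| \le 1 \tag{13.86}$$

we obtain expansions

$$T_n(x) = x^n - \binom{n}{2}x^{n-2}(1 - x^2) + \binom{n}{4}x^{n-4}(1 - x^2)^2 - \cdots, \tag{13.87a}$$

and

$$V_n(x) = \sqrt{1 - x^2}\left[\binom{n}{1}x^{n-1} - \binom{n}{3}x^{n-3}(1 - x^2) + \binom{n}{5}x^{n-5}(1 - x^2)^2 - \cdots\right]. \tag{13.87b}$$

Here the binomial coefficient $\binom{n}{m}$ is given by

$$\binom{n}{m} = \frac{n!}{m!\,(n-m)!}.$$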
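The binomial expansions of Eqs. 13.87a and 13.87b are the real and imaginary parts of $[x + i(1-x^2)^{1/2}]^n$, so they can be compared directly with the trigonometric forms $\cos n\theta$ and $\sin n\theta$. The sketch below does this for one sample point; the function names are hypothetical and `math.comb` requires Python 3.8 or later.

```python
from math import acos, comb, cos, sin, sqrt


def t_from_binomial(n, x):
    """T_n(x) from Eq. 13.87a: even powers j of i*sqrt(1 - x^2) in the binomial expansion."""
    return sum((-1) ** (j // 2) * comb(n, j) * x ** (n - j) * (1.0 - x * x) ** (j // 2)
               for j in range(0, n + 1, 2))


def v_from_binomial(n, x):
    """V_n(x) from Eq. 13.87b: odd powers j, with one factor sqrt(1 - x^2) pulled out."""
    return sqrt(1.0 - x * x) * sum(
        (-1) ** ((j - 1) // 2) * comb(n, j) * x ** (n - j) * (1.0 - x * x) ** ((j - 1) // 2)
        for j in range(1, n + 1, 2))


if __name__ == "__main__":
    n, x = 5, 0.4
    theta = acos(x)
    print(t_from_binomial(n, x), cos(n * theta))   # Eq. 13.83a
    print(v_from_binomial(n, x), sin(n * theta))   # Eq. 13.83b
```

The alternating signs come from $i^j$: even $j$ contributes $(-1)^{j/2}$ to the real part and odd $j$ contributes $i(-1)^{(j-1)/2}$ to the imaginary part.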
From the generating functions, or from the differential equations, power-series
representations are

$$T_n(x) = \frac{n}{2}\sum_{m=0}^{[n/2]} (-1)^m \frac{(n - m - 1)!}{m!\,(n - 2m)!}\,(2x)^{n-2m} \tag{13.88a}$$

and

$$U_n(x) = \sum_{m=0}^{[n/2]} (-1)^m \frac{(n - m)!}{m!\,(n - 2m)!}\,(2x)^{n-2m}. \tag{13.88b}$$
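The power-series representations, Eqs. 13.88a and 13.88b, reproduce the entries of Tables 13.4 and 13.5. A minimal sketch follows, with hypothetical function names and with Eq. 13.88a applied only for $n \ge 1$ (the factorial $(n - m - 1)!$ is not defined for $n = 0$, $m = 0$).

```python
from fractions import Fraction
from math import factorial


def t_coeffs(n):
    """Map power -> integer coefficient of T_n(x), from Eq. 13.88a (n >= 1)."""
    if n < 1:
        raise ValueError("Eq. 13.88a is used here for n >= 1 only")
    return {n - 2 * m: int(Fraction(n, 2) * (-1) ** m
                           * Fraction(factorial(n - m - 1),
                                      factorial(m) * factorial(n - 2 * m))
                           * 2 ** (n - 2 * m))
            for m in range(n // 2 + 1)}


def u_coeffs(n):
    """Map power -> integer coefficient of U_n(x), from Eq. 13.88b."""
    return {n - 2 * m: (-1) ** m * factorial(n - m)
                       // (factorial(m) * factorial(n - 2 * m)) * 2 ** (n - 2 * m)
            for m in range(n // 2 + 1)}


if __name__ == "__main__":
    print(t_coeffs(6))   # compare with the last row of Table 13.4
    print(u_coeffs(6))   # compare with the last row of Table 13.5
```

The integer division in `u_coeffs` is exact because $(n-m)!/[m!\,(n-2m)!]$ is a binomial coefficient.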


Orthogonality 

If Eq. 13.79 is put into self-adjoint form (Section 9.1), we obtain $w(x) =
(1 - x^2)^{-1/2}$ as a weighting factor. For Eq. 13.80 the corresponding weighting
factor is $(1 - x^2)^{+1/2}$. The resulting orthogonality integrals are

$$\int_{-1}^{1} T_m(x)\,T_n(x)\,(1 - x^2)^{-1/2}\,dx = \begin{cases} 0, & m \ne n, \\ \dfrac{\pi}{2}, & m = n \ne 0, \\ \pi, & m = n = 0, \end{cases} \tag{13.89}$$

$$\int_{-1}^{1} V_m(x)\,V_n(x)\,(1 - x^2)^{-1/2}\,dx = \begin{cases} 0, & m \ne n, \\ \dfrac{\pi}{2}, & m = n \ne 0, \\ 0, & m = n = 0, \end{cases} \tag{13.90}$$

$$\int_{-1}^{1} U_m(x)\,U_n(x)\,(1 - x^2)^{1/2}\,dx = \frac{\pi}{2}\,\delta_{m,n}, \tag{13.91}$$

and

$$\int_{-1}^{1} W_m(x)\,W_n(x)\,(1 - x^2)^{1/2}\,dx = \frac{\pi}{2}\,\delta_{m,n}. \tag{13.92}$$

This orthogonality is a direct consequence of the Sturm-Liouville theory,
Chapter 9. The normalization values may best be obtained by using $x = \cos\theta$
and converting these four integrals into Fourier normalization integrals (for the
half interval $[0, \pi]$).
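The conversion to Fourier integrals suggested above also gives a simple numerical check. With $x = \cos\theta$ the type I integral becomes $\int_0^\pi \cos m\theta\,\cos n\theta\,d\theta$ and the type II integral becomes $\int_0^\pi \sin(m+1)\theta\,\sin(n+1)\theta\,d\theta$. The sketch below evaluates these with a midpoint rule, which is exact (up to rounding) for such trigonometric products as long as the combined frequency stays below twice the number of points; the function names and the choice of 64 points are assumptions made for illustration.

```python
from math import cos, pi, sin


def fourier_midpoint(f, n_points=64):
    """Midpoint-rule estimate of the integral of f(theta) over [0, pi]."""
    h = pi / n_points
    return h * sum(f((j + 0.5) * h) for j in range(n_points))


def overlap_t(m, n):
    """Left-hand side of Eq. 13.89 after the substitution x = cos(theta)."""
    return fourier_midpoint(lambda th: cos(m * th) * cos(n * th))


def overlap_u(m, n):
    """Left-hand side of Eq. 13.91 after the substitution x = cos(theta)."""
    return fourier_midpoint(lambda th: sin((m + 1) * th) * sin((n + 1) * th))


if __name__ == "__main__":
    print(overlap_t(0, 0), pi)        # m = n = 0
    print(overlap_t(3, 3), pi / 2)    # m = n != 0
    print(overlap_t(2, 5))            # m != n
    print(overlap_u(4, 4), pi / 2, overlap_u(1, 3))
```

The midpoint nodes $\theta_j = (j + \tfrac{1}{2})\pi/N$ correspond to $x_j = \cos\theta_j$, the zeros of $T_N(x)$, so this is the same rule as Gauss-Chebyshev quadrature written in the angle variable.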


EXERCISES 


13.3.1 Another Chebyshev generating function is

$$\frac{1 - xt}{1 - 2xt + t^2} = \sum_{n=0}^{\infty} X_n(x)\,t^n, \qquad |t| < 1.$$

How is $X_n(x)$ related to $T_n(x)$ and $U_n(x)$?


13.3.2 Given

$$(1 - x^2)\,U_n''(x) - 3x\,U_n'(x) + n(n+2)\,U_n(x) = 0,$$

show that $V_n(x)$ satisfies

$$(1 - x^2)\,V_n''(x) - x\,V_n'(x) + n^2\,V_n(x) = 0,$$

which is Chebyshev’s equation.


13.3.3 Show that the Wronskian of $T_n(x)$ and $V_n(x)$ is given by

$$T_n(x)\,V_n'(x) - T_n'(x)\,V_n(x) = -\frac{n}{(1 - x^2)^{1/2}}.$$

This verifies that $T_n$ and $V_n$ ($n \ne 0$) are independent solutions of Eq. 13.79.
Conversely, for $n = 0$, we do not have linear independence. What happens at
$n = 0$? Where is the “second” solution?

13.3.4 Show that $W_n(x) = (1 - x^2)^{-1/2}\,T_{n+1}(x)$ is a solution of

$$(1 - x^2)\,W_n''(x) - 3x\,W_n'(x) + n(n+2)\,W_n(x) = 0.$$

13.3.5 Evaluate the Wronskian of $U_n(x)$ and $W_n(x) = (1 - x^2)^{-1/2}\,T_{n+1}(x)$.

13.3.6 $V_n(x) = (1 - x^2)^{1/2}\,U_{n-1}(x)$ is not defined for $n = 0$. Show that a second and
independent solution of the Chebyshev differential equation for $T_n(x)$, ($n = 0$)
is $V_0(x) = \arccos x$ (or $\arcsin x$).

13.3.7 Show that $V_n(x)$ satisfies the same three-term recurrence relation as $T_n(x)$
(Eq. 13.69).


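13.3.8 Verify the series solutions for $T_n(x)$ and $U_n(x)$ (Eqs. 13.88a and 13.88b).

13.3.9 Transform the series form of $T_n(x)$, Eq. 13.88a, into an ascending power series.

ANS. $$T_{2n}(x) = (-1)^n\,n\sum_{m=0}^{n} (-1)^m \frac{(n + m - 1)!}{(n - m)!\,(2m)!}\,(2x)^{2m}, \qquad n \ge 1,$$

$$T_{2n+1}(x) = (-1)^n\,\frac{2n+1}{2}\sum_{m=0}^{n} (-1)^m \frac{(n + m)!}{(n - m)!\,(2m+1)!}\,(2x)^{2m+1}.$$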


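13.3.10 Rewrite the series form of $U_n(x)$, Eq. 13.88b, as an ascending power series.

ANS. $$U_{2n}(x) = (-1)^n\sum_{m=0}^{n} (-1)^m \frac{(n + m)!}{(n - m)!\,(2m)!}\,(2x)^{2m},$$

$$U_{2n+1}(x) = (-1)^n\sum_{m=0}^{n} (-1)^m \frac{(n + m + 1)!}{(n - m)!\,(2m+1)!}\,(2x)^{2m+1}.$$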


13.3.11 Derive the Rodrigues representation of $T_n(x)$,

$$T_n(x) = \frac{(-1)^n \pi^{1/2}(1 - x^2)^{1/2}}{2^n\,(n - \tfrac{1}{2})!}\frac{d^n}{dx^n}\left[(1 - x^2)^{n-1/2}\right].$$

Hints. One possibility is to use the hypergeometric function relation

$${}_2F_1(a, b, c; z) = (1 - z)^{-a}\,{}_2F_1\!\left(a, c - b, c; \frac{-z}{1 - z}\right)$$

with $z = (1 - x)/2$. An alternate approach is to develop a first-order differential
equation for $y = (1 - x^2)^{n-1/2}$. Repeated differentiation of this equation leads
to the Chebyshev equation.



13.3.12 (a) From the differential equation for $T_n$ (in self-adjoint form) show that

$$\int_{-1}^{1}\frac{dT_m(x)}{dx}\frac{dT_n(x)}{dx}\,(1 - x^2)^{1/2}\,dx = 0, \qquad m \ne n.$$

(b) Confirm the preceding result by showing that

$$\frac{dT_n(x)}{dx} = n\,U_{n-1}(x).$$


13.3.13 The expansion of a power of $x$ in a Chebyshev series leads to the integral

$$I_{mn} = \int_{-1}^{1} x^m\,T_n(x)\,\frac{dx}{\sqrt{1 - x^2}}.$$

(a) Show that this integral vanishes for $m < n$.

(b) Show that this integral vanishes for $m + n$ odd.


13.3.14 Evaluate the integral

$$I_{mn} = \int_{-1}^{1} x^m\,T_n(x)\,\frac{dx}{\sqrt{1 - x^2}}$$

for $m \ge n$ and $m + n$ even by each of two methods:

(a) Operate with $x$ as the variable replacing $T_n$ by its Rodrigues representation.

(b) Using $x = \cos\theta$, transform the integral to a form with $\theta$ as the variable.

ANS. $$I_{mn} = \pi\,\frac{m!}{(m - n)!}\,\frac{(m - n - 1)!!}{(m + n)!!}, \qquad m \ge n, \quad m + n \text{ even}.$$


13.3.15 Establish the following bounds, $-1 \le x \le 1$:

(a) $|U_n(x)| \le n + 1$,

(b) $\left|\dfrac{d}{dx}T_n(x)\right| \le n^2$.

13.3.16 (a) Establish the following bound, $-1 \le x \le 1$:

$$|V_n(x)| \le 1.$$

(b) Show that $W_n(x)$ is unbounded in $-1 \le x \le 1$.


13.3.17 Verify the orthogonality-normalization integrals for

(a) $T_m(x)$, $T_n(x)$

(b) $V_m(x)$, $V_n(x)$

(c) $U_m(x)$, $U_n(x)$

(d) $W_m(x)$, $W_n(x)$.

Hint. All these can be converted to Fourier orthogonality-normalization
integrals.

13.3.18 Show whether

(a) $T_m(x)$ and $V_n(x)$ are or are not orthogonal over the interval $[-1, 1]$ with
respect to the weighting factor $(1 - x^2)^{-1/2}$.

(b) $U_m(x)$ and $W_n(x)$ are or are not orthogonal over the interval $[-1, 1]$ with
respect to the weighting factor $(1 - x^2)^{1/2}$.

13.3.19 Derive

(a) $T_{n+1}(x) + T_{n-1}(x) = 2x\,T_n(x)$,

(b) $T_{m+n}(x) + T_{m-n}(x) = 2\,T_m(x)\,T_n(x)$,

from the "corresponding" cosine identities.


13.3.20 A number of equations relate the two types of Chebyshev polynomials. As
examples show that

$$T_n(x) = U_n(x) - x\,U_{n-1}(x)$$

and

$$(1-x^2)\,U_n(x) = x\,T_{n+1}(x) - T_{n+2}(x).$$


13.3.21 Show that

$$\frac{dV_n(x)}{dx} = -n\,\frac{T_n(x)}{\sqrt{1-x^2}}$$

(a) using the trigonometric forms of $V_n$ and $T_n$,

(b) using the Rodrigues representation.


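13.3.22 Starting with $x = \cos\theta$ and $T_n(\cos\theta) = \cos n\theta$, expand

$$x^k = \left(\frac{e^{i\theta} + e^{-i\theta}}{2}\right)^k$$

and show that

$$x^k = \frac{1}{2^{k-1}}\left[T_k(x) + \binom{k}{1}T_{k-2}(x) + \binom{k}{2}T_{k-4}(x) + \cdots\right],$$

the series in brackets terminating with $\binom{k}{m}T_1(x)$ for k = 2m + 1 or
$\frac{1}{2}\binom{k}{m}T_0$ for k = 2m.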




13.3.23 (a) Calculate and tabulate the Chebyshev functions $V_1(x)$, $V_2(x)$, and $V_3(x)$
for x = -1.0(0.1)1.0.

(b) A second solution of the Chebyshev differential equation, Eq. 13.79, for
n = 0 is $y(x) = \sin^{-1} x$. Tabulate and plot this function over the same range:
-1.0(0.1)1.0.

13.3.24 Write a program that will generate the coefficients $a_s$ in the polynomial form of
the Chebyshev polynomial, $T_n(x) = \sum_{s=0}^{n} a_s x^s$.

13.3.25 Tabulate $T_{10}(x)$ for 0.00(0.01)1.00. This will include the five positive roots of
$T_{10}$. If a plotting subroutine is available, plot your results.

13.3.26 Determine the five positive roots of $T_{10}(x)$ by calling a root-finding subroutine
(compare Appendix 1). Use your knowledge of the approximate location of
these roots from Exercise 13.3.25 or write a search routine to look for the roots.
These five positive roots (and their negatives) are the evaluation points of the
10-point Gauss-Chebyshev quadrature method (Appendix 2).

$$x_k = \cos[(2k-1)\pi/20], \qquad k = 1, 2, 3, 4, 5.$$


13.4 CHEBYSHEV POLYNOMIALS— NUMERICAL 
APPLICATIONS 


In contrast with the Legendre, Hermite, and Laguerre polynomials, the
Chebyshev polynomials $T_n(x)$ play no significant role in a direct description
of the physical world. Their importance stems from a rapidly growing wealth
of applications in numerical analysis. The following are examples:

a. Chebyshev polynomials. They provide a convenient
and rather accurate approximation to a minimax
approximation of a function over [-1, 1]. This
minimax approximation is an approximation in
which the maximum magnitude of the error (of the
approximation) is minimized.

b. Numerical evaluation of integrals, Gauss-Chebyshev 
quadrature. Compare Appendix 2. 

c. A variety of miscellaneous applications, including 
matrix inversion and numerical integration of 
differential equations. 

Here we concentrate on (a), Chebyshev series and their use in approximating 
functions. 


TRIGONOMETRIC FORM 

From the preceding section 

$$T_n(\cos\theta) = \cos n\theta$$ (13.93a)

or

$$T_n(x) = \cos(n\cos^{-1}x).$$ (13.93b)

From this trigonometric form we obtain the properties that make these orthogonal
polynomials so useful in numerical analysis (over the orthogonality
interval [-1, 1]).

a. $|T_n(x)| \le 1$

b. $\max T_n(x) = +1$, $\min T_n(x) = -1$ (13.94)

for all maxima and minima. This leads to the 
equiripple property discussed later. 

c. The maxima and minima are spread reasonably 
uniformly over the range [ — 1, 1]. 
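
As an illustrative sketch (not part of the original text), the trigonometric form and the three properties above can be checked numerically. The function names are hypothetical, and the sketch assumes the standard three-term recurrence $T_{n+1}(x) = 2xT_n(x) - T_{n-1}(x)$ as the independent way of evaluating $T_n$.

```python
import math

def chebyshev_t(n, x):
    """Evaluate T_n(x) from the recurrence T_{n+1} = 2x T_n - T_{n-1}."""
    t_prev, t_curr = 1.0, x          # T_0 and T_1
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr

def chebyshev_t_trig(n, x):
    """Evaluate T_n(x) from the trigonometric form, Eq. 13.93b (|x| <= 1)."""
    return math.cos(n * math.acos(x))

def extremum_points(n):
    """Points x_k = cos(k*pi/n), k = 0..n, where T_n(x_k) = cos(k*pi) = (-1)^k."""
    return [math.cos(k * math.pi / n) for k in range(n + 1)]
```

The extrema sit at equal steps in $\theta$, not in x, which is why they crowd slightly toward the ends of the interval while remaining "reasonably uniformly" spread.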


Chebyshev Series 

The representation of a function f(x) by a series of Chebyshev polynomials
has some significant advantages over the regular power series: (1) The convergence
is much more rapid,¹ (2) the technique of telescoping series to obtain a
more compact representation is opened up, and (3) a minimax approximation
is approached.

From

$$f(x) = \sum_{n=0}^{\infty} a_n T_n(x)$$ (13.95)

the coefficients $a_n$ can be calculated by using the orthogonality of the Chebyshev
polynomials and the normalization, Eq. 13.87. We obtain

$$a_n = \frac{2}{\pi}\int_{-1}^{1} f(x)\,T_n(x)\,(1-x^2)^{-1/2}\,dx, \qquad n = 1, 2, 3, \ldots,$$ (13.96)

and half of this for $a_0$. This anomalous behavior of the first coefficient is repeated
in the Fourier cosine series of Chapter 14. Note that this is a least-squares fit.

¹ The basic theorem was proved by Chebyshev.

Actually the Chebyshev series is a Fourier cosine series in disguise. With
Eq. 13.93a, Eq. 13.95 becomes

$$f(\cos\theta) = \sum_{n=0}^{\infty} a_n \cos n\theta,$$ (13.97)

similar to Eq. 14.1. 
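
The cosine-series form suggests a simple way to estimate the coefficients numerically. The following is a minimal sketch, not taken from the text: it assumes the integral of Eq. 13.96 is approximated by a midpoint rule in $\theta$ (equivalently, Gauss-Chebyshev quadrature in x), and the function name and the `n_nodes` parameter are hypothetical.

```python
import math

def chebyshev_coefficients(f, n_terms, n_nodes=64):
    """Approximate a_0 .. a_{n_terms-1} of f(x) = sum a_n T_n(x) on [-1, 1].

    Uses a_n = (2/pi) * integral_0^pi f(cos t) cos(n t) dt, evaluated with the
    midpoint rule at t_k = (k + 1/2) pi / n_nodes, and half of this for a_0.
    """
    coeffs = []
    for n in range(n_terms):
        total = 0.0
        for k in range(n_nodes):
            theta = (k + 0.5) * math.pi / n_nodes
            total += f(math.cos(theta)) * math.cos(n * theta)
        a_n = 2.0 * total / n_nodes
        if n == 0:
            a_n *= 0.5               # the anomalous first coefficient
        coeffs.append(a_n)
    return coeffs
```

For a polynomial f of modest degree the rule is exact up to rounding; for example `chebyshev_coefficients(lambda x: x**5, 6)` should reproduce the coefficients 10/16, 5/16, and 1/16 of $T_1$, $T_3$, and $T_5$.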

If f(x) is a finite power series (polynomial), the Chebyshev coefficients may
be determined by other techniques that are faster and more accurate than the
direct integration of Eq. 13.96. We have

$$\sum_{n=0}^{N} b_n x^n = \sum_{n=0}^{N} a_n T_n(x).$$ (13.98)

The equality of upper limits n = N is plausible if we recall that $T_n(x)$ has $x^n$ as
its highest power. The $T_n(x)$ then are a reordering of the powers of x appearing
in the power series. This argument can be made rigorous by mathematical
induction or the Gram-Schmidt orthogonalization of Section 9.3.

With the power-series coefficients $b_n$ known, there are various techniques
for determining the unknown Chebyshev coefficients, $a_n$.

Matrix Multiplication In direct analogy with Exercise 12.2. 1 for Legendre 
polynomials we can set up the Chebyshev transformation matrix and obtain 
the a n coefficients by matrix multiplication. 

We may write

$$x^n = \sum_{s=0}^{n} c_{ns} T_s,$$ (13.99)

with the $c_{ns}$ tabulated in AMS-55, Table 22.3. Substituting into Eq. 13.98
(with the dummy index n on the right replaced by s) and equating coefficients
of the same $T_s$, we obtain

$$\langle b_n |\,(c_{ns}) = \langle a_s |,$$

where $\langle b_n|$ and $\langle a_s|$ are row vectors (bras) and $c_{ns}$ is a matrix, actually lower
left triangular. Taking the adjoint

$$(c_{sn})\,| b_n \rangle = | a_s \rangle,$$ (13.100)

we have $|b_n\rangle$ and $|a_s\rangle$ as column vectors (kets). The power series to Chebyshev
series transformation $(c_{sn})$, now an upper right triangular matrix, is given by


$$(c_{sn}) = \begin{pmatrix}
1 & 0 & \tfrac{1}{2} & 0 & \tfrac{3}{8} & 0 \\
0 & 1 & 0 & \tfrac{3}{4} & 0 & \tfrac{10}{16} \\
0 & 0 & \tfrac{1}{2} & 0 & \tfrac{4}{8} & 0 \\
0 & 0 & 0 & \tfrac{1}{4} & 0 & \tfrac{5}{16} \\
0 & 0 & 0 & 0 & \tfrac{1}{8} & 0 \\
0 & 0 & 0 & 0 & 0 & \tfrac{1}{16}
\end{pmatrix}.$$ (13.101)

The right-hand column of this matrix is taken from

$$x^5 = \tfrac{1}{16}\{10\,T_1(x) + 5\,T_3(x) + T_5(x)\},$$

a special case of Eq. 13.99 for n = 5. For n > 0 the nth column contains a
factor of $1/2^{n-1}$ which may be factored out.

A significant limitation of this matrix transformation technique, Eq. 13.100, 
is that the matrix size and therefore the upper limit N is fixed. In the preceding 
case N = 5. If you wish to handle N = 6, then the coefficients of x 6 in Eq. 13.99 
must be added on the right (and zeros at the bottom). 
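
One way around the fixed size is to generate the matrix for any N instead of tabulating it. The sketch below is an illustration under stated assumptions, not the book's program: it builds column n + 1 from column n by multiplying by x and applying $xT_0 = T_1$ and $xT_s = \frac{1}{2}(T_{s+1} + T_{s-1})$. The function names are hypothetical, and exact rational arithmetic (`fractions.Fraction`) is used only so that the entries can be compared with Eq. 13.101.

```python
from fractions import Fraction

def power_to_chebyshev_matrix(n_max):
    """Return c[s][n], the coefficient of T_s in x**n, for 0 <= s, n <= n_max."""
    c = [[Fraction(0)] * (n_max + 1) for _ in range(n_max + 1)]
    c[0][0] = Fraction(1)                      # x**0 = T_0
    for n in range(n_max):
        for s in range(n + 1):
            coeff = c[s][n]
            if coeff == 0:
                continue
            if s == 0:
                c[1][n + 1] += coeff           # x T_0 = T_1
            else:
                c[s + 1][n + 1] += coeff / 2   # x T_s = (T_{s+1} + T_{s-1}) / 2
                c[s - 1][n + 1] += coeff / 2
    return c

def chebyshev_from_power(b):
    """Apply Eq. 13.100: a_s = sum_n c[s][n] * b[n]."""
    n_max = len(b) - 1
    c = power_to_chebyshev_matrix(n_max)
    return [sum(c[s][n] * b[n] for n in range(n_max + 1))
            for s in range(n_max + 1)]
```

Calling `power_to_chebyshev_matrix(6)` simply adds one more column and one more row, which is the extension the paragraph above describes.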

Fast Fourier Transform Method This is discussed in Chapter 14. 
Recurrence Relation Iteration This technique is discussed subsequently. 

Power Series to Chebyshev Series 

Let us rewrite our polynomial in a nested multiplication form

$$f(x) = b_0 + x(b_1 + \cdots + x(b_{N-2} + x(b_{N-1} + x\,b_N))\cdots).$$ (13.102)

We employ the recurrence relation, Eq. 13.69, as

$$x\,T_n(x) = \tfrac{1}{2}\left[T_{n+1}(x) + T_{n-1}(x)\right], \qquad n = 1, 2, \ldots$$ (13.103)

and

$$x\,T_0 = T_1 \qquad (\text{for } n = 0).$$ (13.104)

Starting with the innermost parentheses, we obtain

$$b_{N-1} + x\,b_N = b_{N-1}T_0(x) + b_N T_1(x)$$ (13.105)

(from Table 13.3). Multiplying by x and using Eqs. 13.103 and 13.104, we have

$$b_{N-2} + x(b_{N-1} + x\,b_N) = b_{N-2}T_0 + x(b_{N-1}T_0 + b_N T_1)
 = b_{N-2}T_0 + b_{N-1}T_1 + \tfrac{1}{2}b_N T_0 + \tfrac{1}{2}b_N T_2.$$ (13.106)

Collecting coefficients, we get

$$b_{N-2} + x(b_{N-1} + x\,b_N) = (b_{N-2} + \tfrac{1}{2}b_N)T_0 + b_{N-1}T_1 + \tfrac{1}{2}b_N T_2.$$ (13.107)

Schematically, we have (coefficient of $T_n$ in the column labeled $T_n$, each row down
giving the result of one more iteration):

    T_0                        T_1                                T_2             T_3

    b_{N-1}                    b_N

    b_{N-2} + (1/2)b_N         b_{N-1}                            (1/2)b_N

    b_{N-3} + (1/2)b_{N-1}     b_{N-2} + (1/2)b_N + (1/4)b_N      (1/2)b_{N-1}    (1/4)b_N

The coefficient of $T_N$ will be $a_N = 2^{-(N-1)} b_N$.

Note the following features:

1. In the mth row $b_{N-m}$ is added into the $T_0$ column.
($b_N$ appears in the $T_1$ column in the first row.)

2. The $T_0$ coefficient of one row is shifted to the $T_1$
column in the next row down (solid arrows).

3. All other entries ($T_1, T_2, \ldots$ columns) are shifted to
both right and left but with a coefficient of 1/2, in
accordance with Eq. 13.103 (dotted arrows).

This procedure continues until the last coefficient $b_0$ has been fed into the $T_0$
column and that row is complete. The number then appearing in the $T_m$ column
is its coefficient $a_m$. As a computing program this procedure is fast and accurate.
It also has the advantage of not requiring any knowledge of the coefficients of
the Chebyshev polynomials (beyond $T_0$ and $T_1$).
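
The row-by-row procedure just described can be sketched in a few lines. This is an illustrative implementation under the assumption that the coefficients are stored as a list `b` with `b[n]` multiplying $x^n$; the function name is hypothetical and plain floating-point arithmetic is used.

```python
def power_to_chebyshev(b):
    """Convert sum b[n] x**n into sum a[n] T_n(x) by recurrence iteration.

    Each pass multiplies the current Chebyshev row by x (Eqs. 13.103, 13.104)
    and then feeds the next power-series coefficient into the T_0 column.
    """
    n_max = len(b) - 1
    a = [0.0] * (n_max + 1)
    a[0] = b[n_max]                        # innermost term: b_N T_0
    for m in range(1, n_max + 1):
        new = [0.0] * (n_max + 1)
        new[1] += a[0]                     # T_0 entry moves to the T_1 column
        for k in range(1, m):              # T_1, T_2, ... split right and left
            new[k + 1] += 0.5 * a[k]
            new[k - 1] += 0.5 * a[k]
        new[0] += b[n_max - m]             # mth row: add b_{N-m} to T_0
        a = new
    return a
```

For instance, `power_to_chebyshev([0, 0, 0, 0, 0, 1])` reproduces the right-hand column of Eq. 13.101, and the last entry of the result is always $2^{-(N-1)} b_N$ for N > 0.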

Telescoping Series (Economization) 

Suppose that cosh x is represented in the interval [-1, 1] by the truncated
Maclaurin series

$$\cosh x \approx \sum_{n=0}^{6} b_{2n} x^{2n},$$ (13.108)

with $b_{2n} = 1/(2n)!$. Since the coefficients form a rapidly decreasing sequence,
the maximum error (at x = 1) is approximately the first term omitted, $1/(14)! =
1.147 \times 10^{-11}$. Transforming to a series of Chebyshev polynomials through
$T_{12}(x)$, we obtain

$$\cosh x \approx \sum_{n=0}^{6} b_{2n} x^{2n} = \sum_{n=0}^{6} a_{2n} T_{2n}(x).$$ (13.109)

The Maclaurin series coefficients and the Chebyshev coefficients are shown in
Table 13.6. The ratio $a_{2n}/b_{2n}$ is also included to exhibit the much more rapid
convergence of the corresponding Chebyshev series.


TABLE 13.6

    2n    Maclaurin series        Chebyshev series       a_{2n}/b_{2n}
          coefficients,(a)        coefficients,(a)       (Chebyshev/Maclaurin)
          Eq. 13.108              Eq. 13.109

     0    1.000 x 10^0            1.2660658777496        1.27
     2    5.000 x 10^-1           0.2714953395298        5.43 x 10^-1
     4    4.167 x 10^-2           0.0054742404393        1.31 x 10^-1
     6    1.389 x 10^-3           0.0000449773215        3.24 x 10^-2
     8    2.480 x 10^-5           0.0000001992120        8.03 x 10^-3
    10    2.756 x 10^-7           0.0000000005505        2.00 x 10^-3
    12    2.088 x 10^-9           0.0000000000010        4.88 x 10^-4

(a) $\cosh x \approx \sum_{n=0}^{6} b_{2n} x^{2n} = \sum_{n=0}^{6} a_{2n} T_{2n}(x)$.
All coefficients are calculated to 13 decimal accuracy.


TABLE 13.7 Approximations to cosh x

    2n    Seven-term              Telescoped to           Telescoped to
          Maclaurin series,       six terms,              five terms,
          b_{2n}                  b'_{2n}                 b''_{2n}

     0    1.000000000000          0.999999999999          1.000000000549
     2    0.500000000000          0.500000000073          0.499999972550
     4    0.041666666667          0.041666665810          0.041666885995
     6    0.001388888889          0.001388892542          0.001388276026
     8    0.000024801587          0.000024794541          0.000025499132
    10    0.000000275573          0.000000281836          -
    12    0.000000002088          -                       -

    Maximum error
          (-) 1.147 x 10^-11      1.3 x 10^-11            5.6 x 10^-10

    Maximum error in Maclaurin series of same number of terms
                                  (-) 2.1 x 10^-9         (-) 2.8 x 10^-7

The final ratio $a_{12}/b_{12}$ is $2^{-11}$, as expected, to within the accuracy of the
Chebyshev coefficient.

Now the final term in this seven-term Chebyshev series is $1.0 \times 10^{-12}\,T_{12}(x)$,
with the maximum magnitude of $1.0 \times 10^{-12}$ by Eq. 13.94. Since our original
approximation of cosh x (Eq. 13.108) is accurate only to $1.1 \times 10^{-11}$, this $T_{12}$
term may be dropped without any significant loss of accuracy! If desired, the
shortened six-term Chebyshev series may be transformed back into a power
series through $x^{10}$. And this telescoped power series has essentially the same
accuracy as the original series through $x^{12}$.

This process of dropping the highest order term of the Chebyshev series 
(telescoping) may be continued as desired. Table 13.7 gives the resulting power- 
series coefficients. 

The maximum error in the six-term telescoped series is comparable to the 
maximum error in the original seven-term series. The maximum error in the 
five-term telescoped series is appreciably less than the maximum error in the 
six-term Maclaurin series. This process of telescoping reduces the maximum 
error (comparing telescoped and Maclaurin series of the same number of terms) 
and distributes it more uniformly across the interval [—1,1] instead of con- 
centrating it at x = ±1. For a fixed number of terms we have approached a 
minimization of the maximum error, that is, a minimax approximation. This
redistribution of the error (shown in Fig. 13.7) is given approximately by the last
$a_m T_m(x)$ dropped and is therefore approximately equiripple.

Our Chebyshev approximations

$$\cosh x \approx \sum_{n=0}^{5} b'_{2n} x^{2n}$$ (13.108a)

$$\phantom{\cosh x} \approx \sum_{n=0}^{4} b''_{2n} x^{2n}$$ (13.108b)

of Table 13.7 are not exact minimax approximations nor is the error curve,
Fig. 13.7, exactly equiripple. The approximation may be modified to be exactly
minimax or the error exactly equiripple by iterative numerical techniques, but
for almost all purposes the Chebyshev approximations will suffice.

FIG. 13.7 Errors in representations of cosh x:
(1) Error in seven-term Maclaurin series telescoped to five terms. (2)
Error in five-term Maclaurin series. (3) Error in six-term Maclaurin
series.
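
The telescoping of the cosh x series can be reproduced with a short script. What follows is a sketch under stated assumptions rather than the program used for Tables 13.6 and 13.7: all function and variable names are hypothetical, ordinary double precision is used, and the maximum error is estimated on a uniform grid of 2001 points.

```python
import math

def power_to_chebyshev(b):
    """sum b[n] x**n -> sum a[n] T_n(x), by the recurrence iteration."""
    n_max = len(b) - 1
    a = [0.0] * (n_max + 1)
    a[0] = b[n_max]
    for m in range(1, n_max + 1):
        new = [0.0] * (n_max + 1)
        new[1] += a[0]
        for k in range(1, m):
            new[k + 1] += 0.5 * a[k]
            new[k - 1] += 0.5 * a[k]
        new[0] += b[n_max - m]
        a = new
    return a

def chebyshev_to_power(a):
    """sum a[n] T_n(x) -> power-series coefficients, via T_{n+1} = 2x T_n - T_{n-1}."""
    out = [0.0] * len(a)
    t_prev, t_curr = [1.0], [0.0, 1.0]           # T_0 and T_1 as power series
    for n, coeff in enumerate(a):
        if n == 0:
            t = t_prev
        elif n == 1:
            t = t_curr
        else:
            t = [0.0] + [2.0 * c for c in t_curr]
            for i, c in enumerate(t_prev):
                t[i] -= c
            t_prev, t_curr = t_curr, t
        for i, c in enumerate(t):
            out[i] += coeff * c
    return out

def max_error(coeffs, n_points=2001):
    """Largest |polynomial - cosh x| on a uniform grid over [-1, 1]."""
    worst = 0.0
    for i in range(n_points):
        x = -1.0 + 2.0 * i / (n_points - 1)
        value = 0.0
        for c in reversed(coeffs):               # Horner evaluation
            value = value * x + c
        worst = max(worst, abs(value - math.cosh(x)))
    return worst

maclaurin = [1.0 / math.factorial(k) if k % 2 == 0 else 0.0 for k in range(13)]
cheb = power_to_chebyshev(maclaurin)             # a_0 .. a_12
telescoped_five = chebyshev_to_power(cheb[:9])   # drop a_12 T_12 and a_10 T_10
maclaurin_five = maclaurin[:9]                   # Maclaurin series through x**8
```

Comparing `max_error(telescoped_five)` with `max_error(maclaurin_five)` shows the gain from telescoping for the same number of terms.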


Shifted Chebyshev Polynomials 

Our Chebyshev polynomials are defined and are orthogonal over the specific
interval [-1, 1]. Since any finite interval $a \le x \le b$ can be transformed into
$-1 \le t \le 1$ by the linear transformation

$$x = \frac{b-a}{2}\,t + \frac{b+a}{2},$$ (13.110)

the choice [-1, 1] is perfectly general. However, it is often convenient to work
in the interval [0, 1] and to define polynomials orthogonal over this interval.
Following Eq. 13.110 we use $T_n(t) = T_n(2x - 1)$ and define these to be the
shifted Chebyshev polynomials $T_n^*(x)$:

$$T_n^*(x) = T_n(2x - 1), \qquad 0 \le x \le 1, \quad n = 0, 1, 2, \ldots.$$ (13.111)

The shifted Chebyshev polynomials may be expressed in terms of an angle $\theta$.
We have

$$2x - 1 = \cos\theta$$

as the argument of $T_n$. Then

$$x = \frac{1 + \cos\theta}{2} = \cos^2\frac{\theta}{2}.$$ (13.112)

Since we have made a linear transformation in going from $T_n$ to $T_n^*$, we still
have

$$T_n^*(x) = \cos n\theta,$$

but now x and $\theta$ are related by Eq. 13.112. The properties of $T_n^*(x)$ may be
derived from the corresponding $T_n(x)$ properties. Because the shifted Chebyshev
polynomials are occasionally useful, the IBM Scientific Subroutine Package (SSP)
provides appropriate subroutines for them.
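
As a small illustration (an assumption-laden sketch, not one of the SSP subroutines mentioned above), the shifted polynomials can be evaluated either through the substitution t = 2x - 1 in the ordinary recurrence or through the angle of Eq. 13.112. The function names are hypothetical.

```python
import math

def shifted_chebyshev(n, x):
    """T*_n(x) = T_n(2x - 1) for 0 <= x <= 1, using T_{k+1} = 2t T_k - T_{k-1}."""
    t = 2.0 * x - 1.0
    prev, curr = 1.0, t                  # T*_0 and T*_1
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, curr = curr, 2.0 * t * curr - prev
    return curr

def shifted_chebyshev_trig(n, x):
    """T*_n(x) = cos(n theta), with x = cos^2(theta / 2), Eq. 13.112."""
    theta = 2.0 * math.acos(math.sqrt(x))
    return math.cos(n * theta)
```

Both routes should agree to rounding error on [0, 1]; for example $T_2^*(x) = 8x^2 - 8x + 1$.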


EXERCISES 


13.4.1 Derive the relations

$$\int_0^1 T_m^*(x)\,T_n^*(x)\,\frac{dx}{(x - x^2)^{1/2}} =
\begin{cases}
0, & m \neq n \\
\pi/2, & m = n \neq 0 \\
\pi, & m = n = 0.
\end{cases}$$


13.4.2 (a) Show that $T_0^*(x) = 1$ and $T_1^*(x) = 2x - 1$.

(b) Derive the shifted Chebyshev polynomial recurrence relation

$$T_{n+1}^*(x) = 2(2x - 1)\,T_n^*(x) - T_{n-1}^*(x).$$

With this recurrence relation and the results of part (a), all the other shifted
Chebyshev polynomials can be developed.


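13.4.3 Develop the following Chebyshev expansions (for [-1, 1]):

(a) $$(1 - x^2)^{1/2} = \frac{2}{\pi}\left[1 - 2\sum_{s=1}^{\infty} (4s^2 - 1)^{-1}\,T_{2s}(x)\right].$$

(b) $$\left.\begin{matrix} +1, & 0 < x \le 1 \\ -1, & -1 \le x < 0 \end{matrix}\right\}
= \frac{4}{\pi}\sum_{s=0}^{\infty} (-1)^s (2s+1)^{-1}\,T_{2s+1}(x).$$

13.4.4 (a) For the interval [-1, 1] show that

$$|x| = \frac{1}{2} + \sum_{s=1}^{\infty} (-1)^{s+1}\,\frac{(2s-3)!!}{(2s+2)!!}\,(4s+1)\,P_{2s}(x)$$

$$\phantom{|x|} = \frac{2}{\pi} + \frac{4}{\pi}\sum_{s=1}^{\infty} (-1)^{s+1}\,\frac{1}{4s^2 - 1}\,T_{2s}(x).$$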


(b) Show that the ratio of the coefficient of $T_{2s}(x)$ to that of $P_{2s}(x)$ approaches
$(\pi s)^{-1/2}$ as $s \to \infty$. This illustrates the relatively rapid convergence of the
Chebyshev series.

Hint. Legendre: with the Legendre recurrence relations, rewrite $xP_n(x)$ as
a linear combination of derivatives. Chebyshev: the trigonometric substitution
$x = \cos\theta$, $T_n(x) = \cos n\theta$ is most helpful.

13.4.5 Show that

$$\frac{\pi^2}{8} = 1 + 2\sum_{s=1}^{\infty} (4s^2 - 1)^{-2}.$$

Hint. Apply Parseval's identity (or the completeness relation) to the results of
Exercise 13.4.4.


13.4.6 Show that

(a) $$\cos^{-1} x = \frac{\pi}{2} - \frac{4}{\pi}\sum_{n=0}^{\infty} \frac{1}{(2n+1)^2}\,T_{2n+1}(x).$$

(b) $$\sin^{-1} x = \frac{4}{\pi}\sum_{n=0}^{\infty} \frac{1}{(2n+1)^2}\,T_{2n+1}(x).$$


13.4.7 (a) Write a double precision subroutine that will transform a finite power series
$\sum_{n=0}^{N} b_n x^n$ into a Chebyshev series $\sum_{n=0}^{N} a_n T_n(x)$. Use the recurrence relation
iteration technique outlined in this section.

(b) Call your subroutine to find the Chebyshev series coefficients for (1) $e^x$,
(2) $e^{-x}$, (3) cosh x, and (4) sinh x. Carry terms through $T_{12}(x)$.

Note. Exercise 11.5.16 is a calculation of these Chebyshev coefficients in
terms of modified Bessel functions, $I_n$.


13.4.8 (a) Using the double precision Chebyshev coefficients for sinh x from Exercise
13.4.7 or 11.5.16 through $a_{11}T_{11}$, drop the $a_{11}T_{11}$ term. Compare the error
in your telescoped series with the error in (1) the original series and (2) the
error in the Maclaurin series of the same number of terms as your telescoped
series. Convert your new Chebyshev series into a power series.

(b) Repeat part (a), dropping $a_9T_9$. Calculate the approximately equiripple
error curve and compare with the error curve for the Maclaurin series
through $b_7x^7$.


13.5 HYPERGEOMETRIC FUNCTIONS 


In Chapter 8 the hypergeometric equation¹

$$x(1 - x)\,y''(x) + [c - (a + b + 1)x]\,y'(x) - ab\,y(x) = 0$$ (13.113)

was introduced as a canonical form of a linear second-order differential equation
with regular singularities at x = 0, 1, and $\infty$. One solution is

$$y(x) = {}_2F_1(a,b,c;x) = 1 + \frac{a \cdot b}{c}\,\frac{x}{1!} + \frac{a(a+1)\,b(b+1)}{c(c+1)}\,\frac{x^2}{2!} + \cdots,
\qquad c \neq 0, -1, -2, -3, \ldots,$$ (13.114)

which is known as the hypergeometric function or hypergeometric series.
The range of convergence is |x| < 1, together with x = 1 for c > a + b and x = -1 for
c > a + b - 1.

¹ This is sometimes called Gauss's differential equation. The solutions then
become Gauss functions.

In terms of the often used Pochhammer symbol

$$(a)_n = a(a + 1)(a + 2) \cdots (a + n - 1) = \frac{(a + n - 1)!}{(a - 1)!}, \qquad (a)_0 = 1,$$ (13.115)

the hypergeometric function becomes

$${}_2F_1(a,b,c;x) = \sum_{n=0}^{\infty} \frac{(a)_n (b)_n}{(c)_n}\,\frac{x^n}{n!}.$$ (13.116)


In this form the subscripts 2 and 1 become clear. The leading subscript 2 
indicates that two Pochhammer symbols appear in the numerator and the 
final subscript 1 indicates one Pochhammer symbol in the denominator. 2 The 
confluent hypergeometric function 1 F 1 with one Pochhammer symbol in the 
numerator and one in the denominator appears in Section 13.6. 

From the form of Eq. 13.114 we see that the parameter c may not be zero 
or a negative integer. On the other hand, if a or b equals 0 or a negative integer, 
the series terminates and the hypergeometric function becomes a simple 
polynomial. 
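
The series of Eq. 13.116 is easy to sum directly inside its range of convergence. The sketch below is illustrative only: the function names, the stopping tolerance, and the term limit are assumptions, and the routine makes no attempt at analytic continuation outside |x| < 1. It also shows the termination of the series when a is a negative integer.

```python
import math

def pochhammer(a, n):
    """(a)_n = a (a + 1) ... (a + n - 1), with (a)_0 = 1, Eq. 13.115."""
    result = 1.0
    for k in range(n):
        result *= a + k
    return result

def hyp2f1_series(a, b, c, x, max_terms=500, tol=1e-16):
    """Sum the hypergeometric series of Eq. 13.116 term by term (|x| < 1)."""
    term, total = 1.0, 1.0
    for n in range(max_terms):
        # ratio of successive terms: (a + n)(b + n) x / ((c + n)(n + 1))
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * x
        total += term
        if abs(term) <= tol * abs(total):    # converged, or series terminated
            break
    return total
```

With a = -2 the third term ratio contains the factor (a + 2) = 0, so `hyp2f1_series(-2, 3, 1, x)` returns the polynomial $1 - 6x + 6x^2$ exactly as the text describes.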

Many more or less elementary functions can be represented by the hypergeometric
function.³ We find

$$\ln(1 + x) = x\,{}_2F_1(1, 1, 2; -x).$$ (13.117)

For the complete elliptic integrals K and E

$$K(k^2) = \int_0^{\pi/2} (1 - k^2\sin^2\theta)^{-1/2}\,d\theta = \frac{\pi}{2}\,{}_2F_1\!\left(\tfrac{1}{2}, \tfrac{1}{2}, 1; k^2\right),$$ (13.118)

$$E(k^2) = \int_0^{\pi/2} (1 - k^2\sin^2\theta)^{1/2}\,d\theta = \frac{\pi}{2}\,{}_2F_1\!\left(\tfrac{1}{2}, -\tfrac{1}{2}, 1; k^2\right).$$ (13.119)

The explicit series forms and other properties of the elliptic integrals are
developed in Section 5.8.

The hypergeometric equation as a second-order linear differential equation
has a second independent solution. The usual form is

$$y(x) = x^{1-c}\,{}_2F_1(a + 1 - c, b + 1 - c, 2 - c; x), \qquad c \neq 2, 3, 4, \ldots.$$ (13.120)

The reader may show (Exercise 13.5.1) that if c is an integer either the two
solutions coincide or (barring a rescue by integral a or integral b) one of the
solutions will blow up. In such a case the second solution is expected to include
a logarithmic term.

² The Pochhammer symbol is often useful in other expressions involving
factorials, for instance,

$$(1 - z)^{-a} = \sum_{n=0}^{\infty} (a)_n z^n/n!, \qquad |z| < 1.$$

³ With three parameters, a, b, and c, we can represent almost anything.

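Alternate forms of the hypergeometric equation include

$$(1 - z^2)\,\frac{d^2}{dz^2}\,y\!\left(\frac{1-z}{2}\right) - \left[(a + b + 1)z - (a + b + 1 - 2c)\right]\frac{d}{dz}\,y\!\left(\frac{1-z}{2}\right)
- ab\,y\!\left(\frac{1-z}{2}\right) = 0,$$ (13.121)

$$(1 - z^2)\,\frac{d^2}{dz^2}\,y(z^2) - \left[(2a + 2b + 1)z + \frac{1 - 2c}{z}\right]\frac{d}{dz}\,y(z^2) - 4ab\,y(z^2) = 0.$$ (13.122)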


Contiguous Function Relations 

The parameters a, b, and c enter in the same way as the parameter n of Bessel,
Legendre, and other special functions. As we found with these functions, we
expect recurrence relations involving unit changes in the parameters a, b, and
c. The usual nomenclature for the hypergeometric functions in which one
parameter changes by +1 or -1 is a "contiguous function." Generalizing this
term to include simultaneous unit changes in more than one parameter, we
find 26 functions contiguous to ${}_2F_1(a,b,c;x)$. Taking them two at a time,
we can develop the formidable total of 325 equations among the contiguous
functions. One typical example is

$$(a - b)\{c(a + b - 1) + 1 - a^2 - b^2 + [(a - b)^2 - 1](1 - x)\}\,{}_2F_1(a,b,c;x)$$
$$\quad = (c - a)(a - b + 1)\,b\;{}_2F_1(a - 1, b + 1, c; x)$$ (13.123)
$$\qquad + (c - b)(a - b - 1)\,a\;{}_2F_1(a + 1, b - 1, c; x).$$

Another contiguous function relation appears in Exercise 13.5.10. 

Hypergeometric Representations 

Since the ultraspherical equation (13.81) in Section 13.3 is a special case of
Eq. 13.113, we see that ultraspherical functions (and Legendre and Chebyshev
functions) may be expressed as hypergeometric functions. For the ultraspherical
function we obtain

$$T_n^{\beta}(x) = \frac{(n + 2\beta)!}{2^{\beta}\,n!\,\beta!}\;{}_2F_1\!\left(-n, n + 2\beta + 1, 1 + \beta; \frac{1 - x}{2}\right).$$ (13.124)

For Legendre and associated Legendre functions

$$P_n(x) = {}_2F_1\!\left(-n, n + 1, 1; \frac{1 - x}{2}\right),$$ (13.125)

$$P_n^m(x) = \frac{(n + m)!}{(n - m)!}\,\frac{(1 - x^2)^{m/2}}{2^m\,m!}\;{}_2F_1\!\left(m - n, m + n + 1, m + 1; \frac{1 - x}{2}\right).$$ (13.126)

Alternate forms are

$$P_{2n}(x) = (-1)^n\,\frac{(2n)!}{2^{2n}\,n!\,n!}\;{}_2F_1\!\left(-n, n + \tfrac{1}{2}, \tfrac{1}{2}; x^2\right)
= (-1)^n\,\frac{(2n - 1)!!}{(2n)!!}\;{}_2F_1\!\left(-n, n + \tfrac{1}{2}, \tfrac{1}{2}; x^2\right),$$ (13.127)

$$P_{2n+1}(x) = (-1)^n\,\frac{(2n + 1)!}{2^{2n}\,n!\,n!}\;x\;{}_2F_1\!\left(-n, n + \tfrac{3}{2}, \tfrac{3}{2}; x^2\right)
= (-1)^n\,\frac{(2n + 1)!!}{(2n)!!}\;x\;{}_2F_1\!\left(-n, n + \tfrac{3}{2}, \tfrac{3}{2}; x^2\right).$$ (13.128)

In terms of hypergeometric functions the Chebyshev functions become

$$T_n(x) = {}_2F_1\!\left(-n, n, \tfrac{1}{2}; \frac{1 - x}{2}\right),$$ (13.129)

$$U_n(x) = (n + 1)\;{}_2F_1\!\left(-n, n + 2, \tfrac{3}{2}; \frac{1 - x}{2}\right),$$ (13.130)

$$V_n(x) = \sqrt{1 - x^2}\;n\;{}_2F_1\!\left(-n + 1, n + 1, \tfrac{3}{2}; \frac{1 - x}{2}\right).$$ (13.131)

The leading factors are determined by direct comparison of complete power
series, comparison of coefficients of particular powers of the variable, or evaluation
at x = 0 or 1, and so on.

The hypergeometric series may be used to define functions with nonintegral
indices. The physical applications are minimal.


EXERCISES

13.5.1 (a) For c, an integer, and a and b, nonintegral, show that

$${}_2F_1(a, b, c; x) \quad\text{and}\quad x^{1-c}\,{}_2F_1(a + 1 - c, b + 1 - c, 2 - c; x)$$

yield only one solution to the hypergeometric equation.

(b) What happens if a is an integer, say, a = -1, and c = -2?

13.5.2 Find the Legendre, Chebyshev I, and Chebyshev II recurrence relations
corresponding to the contiguous hypergeometric function equation (13.123).

13.5.3 Transform the following polynomials into hypergeometric functions of
argument $x^2$. (a) $T_{2n}(x)$; (b) $x^{-1}T_{2n+1}(x)$; (c) $U_{2n}(x)$; (d) $x^{-1}U_{2n+1}(x)$.

ANS. (a) $T_{2n}(x) = (-1)^n\,{}_2F_1(-n, n, \tfrac{1}{2}; x^2)$.

(b) $x^{-1}T_{2n+1}(x) = (-1)^n (2n + 1)\,{}_2F_1(-n, n + 1, \tfrac{3}{2}; x^2)$.

(c) $U_{2n}(x) = (-1)^n\,{}_2F_1(-n, n + 1, \tfrac{1}{2}; x^2)$.

(d) $x^{-1}U_{2n+1}(x) = (-1)^n (2n + 2)\,{}_2F_1(-n, n + 2, \tfrac{3}{2}; x^2)$.
13.5.4 Derive or verify the leading factor in the hypergeometric representations of the
Chebyshev functions.


13.5.5 Verify that the Legendre function of the second kind, $Q_\nu(z)$, is given by

$$Q_\nu(z) = \frac{\pi^{1/2}\,\nu!}{(\nu + \tfrac{1}{2})!\,(2z)^{\nu+1}}\;{}_2F_1\!\left(\frac{\nu}{2} + \frac{1}{2}, \frac{\nu}{2} + 1, \nu + \frac{3}{2}; z^{-2}\right),$$

$$|z| > 1, \qquad |\arg z| < \pi, \qquad \nu \neq -1, -2, -3, \ldots.$$


13.5.6 Analogous to the incomplete gamma function, we may define an incomplete
beta function by

$$B_x(a, b) = \int_0^x t^{a-1}(1 - t)^{b-1}\,dt.$$

Show that

$$B_x(a, b) = a^{-1} x^a\,{}_2F_1(a, 1 - b, a + 1; x).$$


13.5.7 Verify the integral representation

$${}_2F_1(a, b, c; z) = \frac{\Gamma(c)}{\Gamma(b)\,\Gamma(c - b)}\int_0^1 t^{b-1}(1 - t)^{c-b-1}(1 - tz)^{-a}\,dt.$$

What restrictions must you place on the parameters b and c, on the variable z?
Note. The restriction on |z| can be dropped by analytic continuation. For
nonintegral a the real axis in the z-plane from 1 to $\infty$ is a cut line.

Hint. The integral is suspiciously like a beta function and can be expanded into
a series of beta functions.

ANS. $\Re(c) > \Re(b) > 0$, and |z| < 1.


13.5.8 Prove that

$${}_2F_1(a, b, c; 1) = \frac{\Gamma(c)\,\Gamma(c - a - b)}{\Gamma(c - a)\,\Gamma(c - b)}, \qquad c \neq 0, -1, -2, \ldots, \qquad c > a + b.$$

Hint. Here is a chance to use the integral representation, Exercise 13.5.7.


13.5.9 Prove that

$${}_2F_1(a, b, c; x) = (1 - x)^{-a}\,{}_2F_1\!\left(a, c - b, c; \frac{-x}{1 - x}\right).$$

Hint. Try the integral representation, Exercise 13.5.7.

Note. This relation is useful in developing a Rodrigues representation of $T_n(x)$
(compare Exercise 13.3.11).


13.5.10 Verify

$${}_2F_1(-n, b, c; 1) = \frac{(c - b)_n}{(c)_n}.$$

Hint. Here is a chance to use the contiguous function relation
[2a - c + (b - a)x]F(a, b, c; x) = a(1 - x)F(a + 1, b, c; x) - (c - a)F(a - 1, b,
c; x) and mathematical induction. Alternatively, you can use the integral
representation and the beta function.
13.6 CONFLUENT HYPERGEOMETRIC FUNCTIONS


The confluent hypergeometric equation¹

$$x\,y''(x) + (c - x)\,y'(x) - a\,y(x) = 0$$ (13.132)

may be obtained from the hypergeometric equation of Section 13.5 by merging
two of its singularities. The resulting equation has a regular singularity at x = 0
and an irregular one at $x = \infty$. One solution of the confluent hypergeometric
equation is

$$y(x) = {}_1F_1(a, c; x) = M(a, c; x) = 1 + \frac{a}{c}\,\frac{x}{1!} + \frac{a(a + 1)}{c(c + 1)}\,\frac{x^2}{2!} + \cdots,
\qquad c \neq 0, -1, -2, \ldots.$$ (13.133)

This solution is convergent for all finite x (or z). In terms of the Pochhammer
symbols, we have

$$M(a, c; x) = \sum_{n=0}^{\infty} \frac{(a)_n}{(c)_n}\,\frac{x^n}{n!}.$$ (13.134)


Clearly, M(a,c;x) becomes a polynomial if the parameter a is 0 or a negative 
integer. Numerous more or less elementary functions may be represented by 
the confluent hypergeometric function. Examples are the error function and the 
incomplete gamma function. 


$$\mathrm{erf}(x) = \frac{2}{\pi^{1/2}}\int_0^x e^{-t^2}\,dt = \frac{2}{\pi^{1/2}}\,x\,M\!\left(\tfrac{1}{2}, \tfrac{3}{2}; -x^2\right),$$ (13.135)

$$\gamma(a, x) = \int_0^x e^{-t}\,t^{a-1}\,dt = a^{-1} x^a M(a, a + 1; -x), \qquad \Re(a) > 0 \quad (\text{from Eq. 10.71}).$$ (13.136)

The error function and the incomplete gamma function are discussed further
in Section 10.5.

A second solution of Eq. 13.132 is given by

$$y(x) = x^{1-c} M(a + 1 - c, 2 - c; x), \qquad c \neq 2, 3, 4, \ldots.$$ (13.137)

Clearly, this coincides with the first solution for c = 1. The standard form of the
second solution of Eq. 13.132 is a linear combination of Eqs. 13.133 and 13.137.


$$U(a, c; x) = \frac{\pi}{\sin \pi c}\left[\frac{M(a, c; x)}{(a - c)!\,(c - 1)!} - \frac{x^{1-c} M(a + 1 - c, 2 - c; x)}{(a - 1)!\,(1 - c)!}\right].$$ (13.138)

Note the resemblance to our definition of the Neumann function, Eq. 11.60. As
with our Neumann function, Eq. 11.60, this definition of U(a, c; x) becomes
indeterminate for c an integer.

¹ This is often called Kummer's equation. The solutions, then, are Kummer
functions.

An alternate form of the confluent hypergeometric equation that will be
useful later is obtained by changing the independent variable from x to $x^2$.

$$\frac{d^2}{dx^2}\,y(x^2) + \left[\frac{2c - 1}{x} - 2x\right]\frac{d}{dx}\,y(x^2) - 4a\,y(x^2) = 0.$$ (13.139)

As with the hypergeometric functions, contiguous functions exist in which the
parameters a and c are changed by ±1. Including the cases of simultaneous
changes in the two parameters,² we have eight possibilities. Taking the original
function and pairs of the contiguous functions, we can develop a total of 28
equations.³

Integral Representations 

It is frequently convenient to have the confluent hypergeometric functions in
integral form. We find (Exercise 13.6.10)

$$M(a, c; x) = \frac{\Gamma(c)}{\Gamma(a)\,\Gamma(c - a)}\int_0^1 e^{xt}\,t^{a-1}(1 - t)^{c-a-1}\,dt, \qquad \Re(c) > \Re(a) > 0,$$ (13.140)

$$U(a, c; x) = \frac{1}{\Gamma(a)}\int_0^{\infty} e^{-xt}\,t^{a-1}(1 + t)^{c-a-1}\,dt, \qquad \Re(x) > 0,\ \Re(a) > 0.$$ (13.141)

Three important techniques for deriving or verifying integral representations 
are as follows: 

1. Transformation of generating function expansions
and Rodrigues representations: The Bessel and
Legendre functions provide examples of this approach.

2. Direct integration to yield a series: This direct technique
is useful for a Bessel function representation
(Exercise 11.1 4 8) and a hypergeometric integral
(Exercise 13.5.7).

3. (a) Verification that the integral representation satisfies
the differential equation, (b) Exclusion of the
other solution, (c) Verification of normalization.
This is the method used in Section 11.6 to establish
an integral representation of the modified Bessel
function, $K_\nu(z)$. It will work here to establish Eqs.
13.140 and 13.141.

Bessel and Modified Bessel Functions 

Kummer's first formula,

$$M(a, c; x) = e^x M(c - a, c; -x),$$ (13.142)

is useful in representing the Bessel and modified Bessel functions. The formula
may be verified by series expansion or use of an integral representation (compare
Exercise 13.6.10).

² Slater refers to these as associated functions.

³ The recurrence relations for Bessel, Hermite, and Laguerre functions are
special cases of these equations.
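
Kummer's first formula lends itself to a quick numerical check from the series of Eq. 13.134. The following is a minimal sketch, not a production routine: the function name, tolerance, and term limit are assumptions, and direct summation is only sensible for moderate |x|, since for large negative x the alternating series loses accuracy to cancellation.

```python
import math

def kummer_m(a, c, x, max_terms=500, tol=1e-16):
    """Sum M(a, c; x) = sum (a)_n / (c)_n * x**n / n!  (Eq. 13.134)."""
    term, total = 1.0, 1.0
    for n in range(max_terms):
        # ratio of successive terms: (a + n) x / ((c + n)(n + 1))
        term *= (a + n) / ((c + n) * (n + 1)) * x
        total += term
        if abs(term) <= tol * abs(total):    # converged, or series terminated
            break
    return total

def kummer_check(a, c, x):
    """Return both sides of Eq. 13.142, M(a,c;x) and e^x M(c-a,c;-x)."""
    return kummer_m(a, c, x), math.exp(x) * kummer_m(c - a, c, -x)
```

For example, `kummer_check(0.3, 1.7, 2.5)` should return two values that agree to rounding error, and `kummer_m(-2, 1, x)` terminates after three terms, giving $1 - 2x + x^2/2$.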

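As expected from the form of the confluent hypergeometric equation and the
character of its singularities, the confluent hypergeometric functions are useful
in representing a number of the special functions of mathematical physics. For
the Bessel functions

$$J_\nu(x) = \frac{e^{-ix}}{\nu!}\left(\frac{x}{2}\right)^{\nu} M\!\left(\nu + \tfrac{1}{2}, 2\nu + 1; 2ix\right),$$ (13.143)

whereas for the modified Bessel functions of the first kind,

$$I_\nu(x) = \frac{e^{-x}}{\nu!}\left(\frac{x}{2}\right)^{\nu} M\!\left(\nu + \tfrac{1}{2}, 2\nu + 1; 2x\right).$$ (13.144)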

Hermite Functions

The Hermite functions are given by

$$H_{2n}(x) = (-1)^n\,\frac{(2n)!}{n!}\,M\!\left(-n, \tfrac{1}{2}; x^2\right),$$ (13.145)

$$H_{2n+1}(x) = (-1)^n\,\frac{2(2n + 1)!}{n!}\,x\,M\!\left(-n, \tfrac{3}{2}; x^2\right),$$ (13.146)

using Eq. 13.139.

Comparing the Laguerre differential equation with the confluent hypergeometric
equation, we have

$$L_n(x) = M(-n, 1; x).$$ (13.147)
The constant is fixed as unity by noting Eq. 13.35 for x = 0. For the associated
Laguerre functions

$$L_n^m(x) = (-1)^m\,\frac{d^m}{dx^m}\,L_{n+m}(x) = \frac{(n + m)!}{n!\,m!}\,M(-n, m + 1; x).$$ (13.148)

Alternate verification is obtained by comparing Eq. 13.148 with the power-series
solution (Eq. 13.41 of Section 13.2). Note that in the hypergeometric form, as
distinct from a Rodrigues representation, the indices n and m need not be
integers and, if they are not integers, $L_n^m(x)$ will not be a polynomial.


Miscellaneous Cases 

There are certain advantages in expressing our special functions in terms of 
hypergeometric and confluent hypergeometric functions. If the general behavior 
of the latter functions is known, the behavior of the special functions we have 
investigated follows as a series of special cases. This may be useful in determining 
asymptotic behavior or evaluating normalization integrals. The asymptotic
behavior of M(a, c; x) and U(a, c; x) may be conveniently obtained from integral
representations of these functions, Eqs. 13.140 and 13.141. The further advantage
is that the relations between the special functions are clarified. For instance, an
examination of Eqs. 13.145, 13.146, and 13.148 suggests that the Laguerre and
Hermite functions are related.

The confluent hypergeometric equation (13.132) is clearly not self-adjoint.
For this and other reasons it is convenient to define

$$M_{k\mu}(x) = e^{-x/2}\,x^{\mu + 1/2}\,M\!\left(\mu - k + \tfrac{1}{2}, 2\mu + 1; x\right).$$ (13.149)

This new function $M_{k\mu}(x)$ is a Whittaker function which satisfies the self-adjoint
equation

$$M''_{k\mu}(x) + \left(-\frac{1}{4} + \frac{k}{x} + \frac{\tfrac{1}{4} - \mu^2}{x^2}\right) M_{k\mu}(x) = 0.$$ (13.150)

The corresponding second solution is

$$W_{k\mu}(x) = e^{-x/2}\,x^{\mu + 1/2}\,U\!\left(\mu - k + \tfrac{1}{2}, 2\mu + 1; x\right).$$ (13.151)


EXERCISES 


13.6.1 Verify the confluent hypergeometric representation of the error function

$$\mathrm{erf}(x) = \frac{2x}{\pi^{1/2}}\,M\!\left(\tfrac{1}{2}, \tfrac{3}{2}; -x^2\right).$$


13.6.2 Show that the Fresnel integrals C(x) and S(x) of Exercise 5.10.2 may be expressed
in terms of the confluent hypergeometric function as

$$C(x) + iS(x) = x\,M\!\left(\tfrac{1}{2}, \tfrac{3}{2}; \frac{i\pi x^2}{2}\right).$$


13.6.3 By direct differentiation and substitution verify that

$$y = a x^{-a}\int_0^x e^{-t}\,t^{a-1}\,dt = a x^{-a}\,\gamma(a, x)$$

actually does satisfy

$$x y'' + (a + 1 + x)\,y' + a y = 0.$$


13.6.4 Show that the modified Bessel function of the second kind $K_\nu(x)$ is given by

$$K_\nu(x) = \pi^{1/2} e^{-x} (2x)^\nu U(\nu + \tfrac12, 2\nu + 1; 2x).$$

13.6.5 Show that the cosine and sine integrals of Section 10.5 may be expressed in
terms of confluent hypergeometric functions as

$$\operatorname{Ci}(x) + i \operatorname{si}(x) = -e^{ix} U(1, 1; -ix).$$

This relation is useful in numerical computation of Ci(x) and si(x) for large
values of x.


13.6.6 Verify the confluent hypergeometric form of the Hermite polynomial $H_{2n+1}(x)$
(Eq. 13.146) by showing that

(a) $H_{2n+1}(x)/x$ satisfies the confluent hypergeometric equation with $a = -n$,
$c = \tfrac32$ and argument $x^2$,

(b) $$\lim_{x \to 0} \frac{H_{2n+1}(x)}{x} = (-1)^n \frac{2(2n+1)!}{n!}.$$

13.6.7 Show that the contiguous confluent hypergeometric function equation,

$$(c - a)M(a - 1, c; x) + (2a - c + x)M(a, c; x) - aM(a + 1, c; x) = 0,$$

leads to the associated Laguerre function recurrence relation (Eq. 13.44).

13.6.8 Verify the Kummer transformations:

(a) $M(a, c; x) = e^x M(c - a, c; -x)$

(b) $U(a, c; x) = x^{1-c} U(a - c + 1, 2 - c; x)$.

13.6.9 Prove that

(a) $$\frac{d^n}{dx^n} M(a, c; x) = \frac{(a)_n}{(c)_n} M(a + n, c + n; x),$$

(b) $$\frac{d^n}{dx^n} U(a, c; x) = (-1)^n (a)_n U(a + n, c + n; x).$$

13.6.10 Verify the following integral representations:

(a) $$M(a, c; x) = \frac{\Gamma(c)}{\Gamma(a)\Gamma(c - a)} \int_0^1 e^{xt} t^{a-1} (1 - t)^{c-a-1}\,dt, \quad \Re(c) > \Re(a) > 0,$$

(b) $$U(a, c; x) = \frac{1}{\Gamma(a)} \int_0^\infty e^{-xt} t^{a-1} (1 + t)^{c-a-1}\,dt, \quad \Re(x) > 0,\ \Re(a) > 0.$$

Under what conditions can you accept $\Re(x) = 0$ in part (b)?

13.6.11 From the integral representation of M(a, c; x), Exercise 13.6.10(a), show that
$M(a, c; x) = e^x M(c - a, c; -x)$.

Hint. Replace the variable of integration t by 1 - s to release a factor $e^x$ from
the integral.

13.6.12 From the integral representation of U(a, c; x), Exercise 13.6.10(b), show that
the exponential integral is given by

$$E_1(x) = e^{-x} U(1, 1; x).$$

Hint. Replace the variable of integration t in $E_1(x)$ by x(1 + s).

13.6.13 From the integral representations of M(a, c; x) and U(a, c; x) in Exercise 13.6.10
develop asymptotic expansions of

(a) M(a, c; x),

(b) U(a, c; x).

Hint. You can use the technique that was employed with $K_\nu(z)$, Section 11.6.

ANS. (a) $$\frac{\Gamma(c)}{\Gamma(a)} \frac{e^x}{x^{c-a}} \left\{ 1 + \frac{(1 - a)(c - a)}{1!\,x} + \frac{(1 - a)(2 - a)(c - a)(c - a + 1)}{2!\,x^2} + \cdots \right\}$$

(b) $$\frac{1}{x^a} \left\{ 1 + \frac{a(1 + a - c)}{1!\,(-x)} + \frac{a(a + 1)(1 + a - c)(2 + a - c)}{2!\,(-x)^2} + \cdots \right\}$$

13.6.14 Show that the Wronskian of the two confluent hypergeometric functions,
M(a, c; x) and U(a, c; x), is given by

$$MU' - M'U = -\frac{(c - 1)!}{(a - 1)!} \frac{e^x}{x^c}.$$

What happens if a is 0 or a negative integer?

13.6.15 The Coulomb wave equation (radial part of the Schrödinger wave equation
with Coulomb potential) is

$$\frac{d^2 y}{d\rho^2} + \left[ 1 - \frac{2\eta}{\rho} - \frac{L(L + 1)}{\rho^2} \right] y = 0.$$

Show that a regular solution, $y = F_L(\eta, \rho)$, is given by

$$F_L(\eta, \rho) = C_L(\eta) \rho^{L+1} e^{-i\rho} M(L + 1 - i\eta, 2L + 2; 2i\rho).$$


13.6.16 (a) Show that the radial part of the hydrogen wave function, Eq. 13.60, may
be written as

$$e^{-\alpha r/2} (\alpha r)^L L_{n-L-1}^{2L+1}(\alpha r) = \frac{(n + L)!}{(n - L - 1)!\,(2L + 1)!}\, e^{-\alpha r/2} (\alpha r)^L M(L + 1 - n, 2L + 2; \alpha r).$$

(b) It was assumed previously that the total (kinetic + potential) energy E
of the electron was negative. Rewrite the (unnormalized) radial wave
function for the free electron, E > 0.

ANS. $e^{+i\alpha r/2} (\alpha r)^L M(L + 1 - in, 2L + 2, -i\alpha r)$, outgoing wave. This
representation provides a powerful alternative technique for
the calculation of photoionization and recombination coefficients.


13.6.17 Show that the Laplace transform of M(a, c; x) is

$$\mathcal{L}\{M(a, c; x)\} = \frac{1}{s}\, {}_2F_1\left(a, 1; c; \frac{1}{s}\right).$$



13.6.18 Evaluate

(a) $$\int_0^\infty [M_{k\mu}(x)]^2\,dx$$

(b) $$\int_0^\infty [M_{k\mu}(x)]^2\,\frac{dx}{x}$$

where $2\mu = 0, 1, 2, \ldots$, $k - \mu - \tfrac12 = 0, 1, 2, \ldots$, $a > -2\mu - 1$.

ANS. (a) $(2\mu)!\,2k$.

(b) $(2\mu)!$.

(c) $(2\mu)!\,(2k)^a$.

14.1 GENERAL PROPERTIES


Fourier Series 

A Fourier series may be defined as an expansion of a function or representa- 
tion of a function in a series of sines and cosines such as 

$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos nx + \sum_{n=1}^{\infty} b_n \sin nx. \tag{14.1}$$

The coefficients $a_0$, $a_n$, and $b_n$ are related to the given function f(x) by definite
integrals: Eqs. 14.11 and 14.12. You will notice that $a_0$ is singled out for special
treatment by the inclusion of the factor $\tfrac12$. This is done so that Eq. 14.11 will apply
to all $a_n$, n = 0 as well as n > 0.

The conditions imposed on f(x) to make Eq. 14.1 valid are that f(x) has only
a finite number of finite discontinuities and only a finite number of extreme
values, maxima, and minima.¹ Functions satisfying these conditions may be
called piecewise regular. The conditions themselves are known as the Dirichlet
conditions. Although there are some functions that do not obey these Dirichlet
conditions, they may well be labeled pathological for purposes of Fourier expansions.
In the vast majority of physical problems involving a Fourier series these
conditions will be satisfied. In most physical problems we shall be interested in
functions that are square integrable (in the Hilbert space $L^2$ of Section 9.4). In
this space the sines and cosines form a complete orthogonal set. And this in turn
means that Eq. 14.1 is valid in the sense of convergence in the mean.

Expressing cos nx and sin nx in exponential form, we may rewrite Eq. 14.1 as

$$f(x) = \sum_{n=-\infty}^{\infty} c_n e^{inx} \tag{14.2}$$

in which

$$c_n = \tfrac12 (a_n - i b_n), \qquad c_{-n} = \tfrac12 (a_n + i b_n), \quad n > 0, \tag{14.3}$$

and

$$c_0 = \tfrac12 a_0.$$

¹ These conditions are sufficient but not necessary.

Completeness

The problem of establishing completeness may be approached in a number
of different ways. One way is to transform the trigonometric Fourier series into
exponential form and compare it with a Laurent series. If we expand f(z) in a
Laurent series² (assuming f(z) is analytic),

$$f(z) = \sum_{n=-\infty}^{\infty} d_n z^n. \tag{14.4}$$

On the unit circle $z = e^{i\theta}$ and

$$f(z) = f(e^{i\theta}) = \sum_{n=-\infty}^{\infty} d_n e^{in\theta}. \tag{14.5}$$

The Laurent expansion on the unit circle (Eq. 14.5) has the same form as the
complex Fourier series (Eq. 14.2), which shows the equivalence between the two
expansions. Since the Laurent series as a power series has the property of completeness,
we see that the Fourier functions, $e^{inx}$, form a complete set. There is a
significant limitation here. Laurent series and power series cannot handle discontinuities
such as a square wave or the sawtooth wave of Fig. 14.1.

The theory of linear vector spaces provides a second approach to the com- 
pleteness of the sines and cosines. Here completeness is established by the 
Weierstrass theorem for two variables. 

The Fourier expansion and the completeness property may be expected, for
the functions sin nx, cos nx, $e^{inx}$ are all eigenfunctions of a self-adjoint linear
differential equation,

$$y'' + n^2 y = 0. \tag{14.6}$$


We obtain orthogonal eigenfunctions for different values of the eigenvalue n by
choosing the interval [0, pπ], p an integer, to satisfy the boundary conditions in
the Sturm-Liouville theory (Chapter 9). If we further choose p = 2, the different
eigenfunctions for the same eigenvalue n may be orthogonal. We have

$$\int_0^{2\pi} \sin mx \sin nx \, dx = \begin{cases} \pi\delta_{mn}, & m \neq 0, \\ 0, & m = 0, \end{cases} \tag{14.7}$$

$$\int_0^{2\pi} \cos mx \cos nx \, dx = \begin{cases} \pi\delta_{mn}, & m \neq 0, \\ 2\pi, & m = n = 0, \end{cases} \tag{14.8}$$

$$\int_0^{2\pi} \sin mx \cos nx \, dx = 0 \quad \text{for all integral } m \text{ and } n. \tag{14.9}$$


Note carefully that any interval $x_0 \le x \le x_0 + 2\pi$ will be equally satisfactory.
Frequently, we shall use $x_0 = -\pi$ to obtain the interval -π ≤ x ≤ π. For the
complex eigenfunctions $e^{\pm inx}$ orthogonality is usually defined in terms of the
complex conjugate of one of the two factors,

$$\int_0^{2\pi} (e^{imx})^* e^{inx} \, dx = 2\pi\delta_{mn}. \tag{14.10}$$

This agrees with the treatment of the spherical harmonics (Section 12.6).

² Section 6.5.


Sturm-Liouville Theory 

The Sturm-Liouville theory guarantees the validity of Eq. 14.1 (for functions
satisfying the Dirichlet conditions) and, by use of the orthogonality relations,
Eqs. 14.7, 14.8, and 14.9, allows us to compute the expansion coefficients

$$a_n = \frac{1}{\pi} \int_0^{2\pi} f(t) \cos nt \, dt, \tag{14.11}$$

$$b_n = \frac{1}{\pi} \int_0^{2\pi} f(t) \sin nt \, dt, \quad n = 0, 1, 2, \ldots. \tag{14.12}$$

This, of course, is subject to the requirement that the integrals exist. They do if
f(t) is piecewise continuous (or square integrable). Substituting Eqs. 14.11 and
14.12 into Eq. 14.1, we write our Fourier expansion as


$$\begin{aligned} f(x) &= \frac{1}{2\pi} \int_0^{2\pi} f(t)\,dt + \frac{1}{\pi} \sum_{n=1}^{\infty} \left( \cos nx \int_0^{2\pi} f(t) \cos nt \, dt + \sin nx \int_0^{2\pi} f(t) \sin nt \, dt \right) \\ &= \frac{1}{2\pi} \int_0^{2\pi} f(t)\,dt + \frac{1}{\pi} \sum_{n=1}^{\infty} \int_0^{2\pi} f(t) \cos n(t - x) \, dt, \end{aligned} \tag{14.13}$$

the first (constant) term being the average value of f(x) over the interval [0, 2π].
Equation 14.13 offers one approach to the development of the Fourier integral
and Fourier transforms, Section 15.1.

Another way of describing what we are doing here is to say that f(x) is part of
an infinite-dimensional Hilbert space, with the orthogonal cos nx and sin nx as
the basis. (They can always be renormalized to unity if desired.) The statement
that cos nx and sin nx (n = 0, 1, 2, ...) span this Hilbert space is equivalent to
saying that they form a complete set. Finally, the expansion coefficients $a_n$ and
$b_n$ correspond to the projections of f(x) with the integral inner products (Eqs.
14.11 and 14.12) playing the role of the dot product of Section 1.3. These points
are outlined in Section 9.4.
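The projection integrals of Eqs. 14.11 and 14.12 can be approximated directly. The following is a minimal sketch, assuming a simple midpoint rule is accurate enough; the function names and the number of cells are illustrative choices, not part of the text. It is applied to the function that is f(t) = t on [0, π) and t - 2π on (π, 2π], for which the coefficients are $a_n = 0$ and $b_n = 2(-1)^{n+1}/n$.

```python
import math

def fourier_coefficients(f, n_max, cells=4000):
    """Approximate a_n and b_n (Eqs. 14.11 and 14.12) on [0, 2*pi] by the midpoint rule."""
    h = 2.0 * math.pi / cells
    midpoints = [(k + 0.5) * h for k in range(cells)]
    values = [f(t) for t in midpoints]
    a, b = [], []
    for n in range(n_max + 1):
        a.append(sum(v * math.cos(n * t) for v, t in zip(values, midpoints)) * h / math.pi)
        b.append(sum(v * math.sin(n * t) for v, t in zip(values, midpoints)) * h / math.pi)
    return a, b

def sawtooth(t):
    """f(t) = t on [0, pi) and t - 2*pi on (pi, 2*pi]."""
    return t if t < math.pi else t - 2.0 * math.pi

# An even number of cells keeps the jump at t = pi on a cell boundary.
a, b = fourier_coefficients(sawtooth, 5)
```

Each coefficient is one inner product of f with a basis function, so adding more harmonics never changes the coefficients already computed.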


Sawtooth Wave 

An idea of the convergence of a Fourier series and the error in using only a
finite number of terms in the series may be obtained by considering the expansion of

$$f(x) = \begin{cases} x, & 0 \le x < \pi, \\ x - 2\pi, & \pi < x \le 2\pi. \end{cases} \tag{14.14}$$

This is a sawtooth wave, and for convenience we shall shift our interval from
[0, 2π] to [-π, π]. In this interval we have simply f(x) = x. Using Eqs. 14.11 and
14.12, we may show the expansion to be

$$f(x) = x = 2\left[ \sin x - \frac{\sin 2x}{2} + \frac{\sin 3x}{3} - \cdots + (-1)^{n+1} \frac{\sin nx}{n} + \cdots \right]. \tag{14.15}$$

FIG. 14.1 Fourier representation of sawtooth wave


Figure 14.1 shows f(x) for 0 ≤ x ≤ π for the sum of 4, 6, and 10 terms of the
series. Three features deserve comment.

1. There is a steady increase in the accuracy of the
representation as the number of terms included is
increased.

2. All the curves pass through the midpoint y = 0 at
x = π.

3. In the vicinity of x = π there is an overshoot that
persists and shows no sign of diminishing.

As a matter of incidental interest, setting x = π/2 in Eq. 14.15 provides an
alternate derivation of Leibnitz’s formula, Exercise 5.7.6.
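The second and third features can be examined numerically from partial sums of Eq. 14.15. This is a rough sketch; the grid of 2001 points on [0, π] and the choice of 10 and 100 terms are assumptions made only for illustration.

```python
import math

def sawtooth_partial_sum(x, terms):
    """Sum of the first `terms` terms of Eq. 14.15."""
    return 2.0 * sum((-1) ** (n + 1) * math.sin(n * x) / n for n in range(1, terms + 1))

grid = [math.pi * k / 2000 for k in range(2001)]  # 0 <= x <= pi

# Largest value reached by each partial sum on the grid.
peaks = {terms: max(sawtooth_partial_sum(x, terms) for x in grid) for terms in (10, 100)}

# Value at the discontinuity x = pi (every term contains sin(n*pi)).
midpoint_value = sawtooth_partial_sum(math.pi, 10)
```

On this grid the peak of each partial sum exceeds π, the largest value of f(x) itself, and the excess does not shrink when more terms are added; only its location moves closer to x = π.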


Behavior of Discontinuities 

The behavior at x = π is an example of a general rule that at a finite discontinuity
the series converges to the arithmetic mean. For a discontinuity at
$x = x_0$ the series yields

$$f(x_0) = \tfrac12 \left[ f(x_0 +) + f(x_0 -) \right], \tag{14.16}$$

the arithmetic mean of the right and left approaches to $x = x_0$. A general proof
using partial sums, as in Section 14.5, is given by Jeffreys and by Carslaw. The
proof may be simplified by the use of Dirac delta functions (Exercise 14.5.1).

The overshoot just before x = π is an example of the Gibbs phenomenon,
discussed in Section 14.5.


Summation of a Fourier Series 

Usually in this chapter we shall be concerned with finding the coefficients of 
the Fourier expansion of a known function. Occasionally, we may wish to 
reverse this process and determine the function represented by a given Fourier 
series. 

Consider the series $\sum_{n=1}^{\infty} (1/n) \cos nx$ on (0, 2π). Since this series is only conditionally
convergent (and diverges at x = 0), we take

$$\sum_{n=1}^{\infty} \frac{\cos nx}{n} = \lim_{r \to 1} \sum_{n=1}^{\infty} \frac{r^n \cos nx}{n}, \tag{14.17}$$

absolutely convergent for |r| < 1. Our procedure is to try forming power series
by transforming the trigonometric functions into exponential form:

$$\sum_{n=1}^{\infty} \frac{r^n \cos nx}{n} = \frac12 \sum_{n=1}^{\infty} \frac{r^n e^{inx}}{n} + \frac12 \sum_{n=1}^{\infty} \frac{r^n e^{-inx}}{n}. \tag{14.18}$$

Now these power series may be identified as Maclaurin expansions of
-ln(1 - z), $z = re^{ix}$, $re^{-ix}$ (Eq. 5.95), and

$$\sum_{n=1}^{\infty} \frac{r^n \cos nx}{n} = -\frac12 \left[ \ln(1 - re^{ix}) + \ln(1 - re^{-ix}) \right] = -\ln\left[ (1 + r^2) - 2r \cos x \right]^{1/2}. \tag{14.19}$$

Letting r = 1,

$$\sum_{n=1}^{\infty} \frac{\cos nx}{n} = -\ln(2 - 2\cos x)^{1/2} = -\ln\left( 2 \sin\frac{x}{2} \right), \quad (0, 2\pi). \tag{14.20}$$

Both sides of this expression diverge as x → 0 and 2π.³
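The steps from Eq. 14.17 to Eq. 14.20 can be checked numerically. The sketch below sums the damped series for r < 1 and compares it with the closed forms; the function names, the sample point x = 1, and the values of r are assumptions chosen for illustration.

```python
import math

def damped_cosine_sum(x, r, terms):
    """Partial sum of sum_{n>=1} r**n cos(n x) / n, the left side of Eq. 14.19."""
    total, r_power = 0.0, 1.0
    for n in range(1, terms + 1):
        r_power *= r
        total += r_power * math.cos(n * x) / n
    return total

def closed_form(x, r):
    """Right side of Eq. 14.19: -(1/2) ln(1 + r**2 - 2 r cos x)."""
    return -0.5 * math.log(1.0 + r * r - 2.0 * r * math.cos(x))

def limit_form(x):
    """Right side of Eq. 14.20, for 0 < x < 2*pi."""
    return -math.log(2.0 * math.sin(0.5 * x))
```

For r = 0.9 a few hundred terms already reproduce `closed_form`, and as r approaches 1 the value of `closed_form(x, r)` approaches `limit_form(x)`.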


EXERCISES 


14.1.1 A function f(x) (quadratically integrable) is to be represented by a finite Fourier
series. A convenient measure of the accuracy of the series is given by the integrated
square of the deviation

$$\Delta_p = \int_0^{2\pi} \left[ f(x) - \frac{a_0}{2} - \sum_{n=1}^{p} (a_n \cos nx + b_n \sin nx) \right]^2 dx.$$

Show that the requirement that $\Delta_p$ be minimized, that is,

$$\frac{\partial \Delta_p}{\partial a_n} = 0, \qquad \frac{\partial \Delta_p}{\partial b_n} = 0,$$

for all n, leads to choosing $a_n$ and $b_n$, as given in Eqs. 14.11 and 14.12.

Note. Your coefficients $a_n$ and $b_n$ are independent of p. This independence is
a consequence of orthogonality and would not hold for powers of x, fitting a
curve with polynomials.


³ The limits may be shifted to [-π, π] (and x ≠ 0) using |x| on the right-hand
side.

14.1.2 In the analysis of a complex waveform (ocean tides, earthquakes, musical tones,
etc.) it might be more convenient to have the Fourier series written as

$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \alpha_n \cos(nx - \theta_n).$$

Show that this is equivalent to Eq. 14.1 with

$$a_n = \alpha_n \cos\theta_n, \qquad \alpha_n^2 = a_n^2 + b_n^2,$$

$$b_n = \alpha_n \sin\theta_n, \qquad \tan\theta_n = b_n / a_n.$$

Note. The coefficients $\alpha_n^2$ as a function of n define what is called the power spectrum.
The importance of $\alpha_n^2$ lies in its invariance under a shift in the phase $\theta_n$.

14.1.3 A function f(x) is expanded in an exponential Fourier series

$$f(x) = \sum_{n=-\infty}^{\infty} c_n e^{inx}.$$

If f(x) is real, f(x) = f*(x), what restriction is imposed on the coefficients $c_n$?

14.1.4 Assuming that $\int_{-\pi}^{\pi} f(x)\,dx$ and $\int_{-\pi}^{\pi} [f(x)]^2\,dx$ are finite, show that

$$\lim_{m \to \infty} a_m = 0, \qquad \lim_{m \to \infty} b_m = 0.$$

Hint. Integrate $[f(x) - s_n(x)]^2$, where $s_n(x)$ is the nth partial sum, and use Bessel’s
inequality, Section 9.4. For our finite interval the assumption that f(x) is square
integrable ($\int_{-\pi}^{\pi} |f(x)|^2\,dx$ is finite) implies that $\int_{-\pi}^{\pi} |f(x)|\,dx$ is also finite. The
converse does not hold.

FIG. 14.2

14.1.5 Apply the summation technique of this section to show that

$$\sum_{n=1}^{\infty} \frac{\sin nx}{n} = \begin{cases} \tfrac12(\pi - x), & 0 < x \le \pi \\ -\tfrac12(\pi + x), & -\pi \le x < 0 \end{cases}$$

(Fig. 14.2).

14.1.6 Sum the trigonometric series

$$\sum_{n=1}^{\infty} (-1)^{n+1} \frac{\sin nx}{n}$$

and show that it equals x/2.

14.1.7 Sum the trigonometric series

$$\sum_{n=0}^{\infty} \frac{\sin(2n + 1)x}{2n + 1}$$

and show that it equals

$$\begin{cases} \pi/4, & 0 < x < \pi \\ -\pi/4, & -\pi < x < 0. \end{cases}$$

14.1.8 Calculate the sum of the finite Fourier sine series for the sawtooth wave, f(x) = x,
(-π, π), Eq. 14.15. Use 4-, 6-, 8-, and 10-term series and x/π = 0.00(0.02)1.00.
If a plotting routine is available, plot your results and compare with Fig. 14.1.


14.2 ADVANTAGES, USES OF FOURIER SERIES 

Discontinuous Function 

One of the advantages of a Fourier representation over some other represen- 
tation, such as a Taylor series, is that it may represent a discontinuous function. 
An example is the sawtooth wave in the preceding section. Other examples are 
considered in Section 14.3 and in the exercises. 

Periodic Functions 

Related to this advantage is the usefulness of a Fourier series in representing
a periodic function. If f(x) has a period of 2π, perhaps it is only natural that we
expand it in a series of functions with period 2π, 2π/2, 2π/3, .... This guarantees
that if our periodic f(x) is represented over one interval [0, 2π] or [-π, π] the
representation holds for all finite x.

At this point we may conveniently consider the properties of symmetry. Using
the interval [-π, π], sin x is odd and cos x is an even function of x. Hence by
Eqs. 14.11 and 14.12,¹ if f(x) is odd, all $a_n = 0$ and if f(x) is even all $b_n = 0$. In
other words,

$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos nx, \quad f(x) \text{ even}, \tag{14.21}$$

$$f(x) = \sum_{n=1}^{\infty} b_n \sin nx, \quad f(x) \text{ odd}. \tag{14.22}$$

Frequently these properties are helpful in expanding a given function.

We have noted that the Fourier series is periodic. This is important in considering
whether Eq. 14.1 holds outside the initial interval. Suppose we are given
only that

$$f(x) = x, \quad 0 \le x \le \pi \tag{14.23}$$

and are asked to represent f(x) by a series expansion. Let us take three of the
infinite number of possible expansions.

1. If we assume a Taylor expansion, we have

$$f(x) = x, \tag{14.24}$$

a one-term series. This (one-term) series is defined for
all finite x.

2. Using the Fourier cosine series (Eq. 14.21), we predict
that

$$f(x) = -x, \quad -\pi < x \le 0, \qquad f(x) = 2\pi - x, \quad \pi < x < 2\pi. \tag{14.25}$$

3. Finally, from the Fourier sine series (Eq. 14.22), we
have

$$f(x) = x, \quad -\pi < x < 0, \qquad f(x) = x - 2\pi, \quad \pi < x < 2\pi. \tag{14.26}$$

¹ With the range of integration -π ≤ x ≤ π.

FIG. 14.3 Comparison of Fourier cosine series, Fourier sine series, and Taylor series

These three possibilities, Taylor series, Fourier cosine series, and Fourier sine
series, are each perfectly valid in the original interval [0, π]. Outside, however,
their behavior is strikingly different (compare Fig. 14.3). Which of the three, then,
is correct? This question has no answer, unless we are given more information
about f(x). It may be any of the three or none of them. Our Fourier expansions
are valid over the basic interval. Unless the function f(x) is known to be periodic
with a period equal to our basic interval, or (1/n)th of our basic interval, there is
no assurance whatever that the representation (Eq. 14.1) will have any meaning
outside the basic interval.

It should be noted that the set of functions cos nx, n = 0, 1, 2, ..., forms a
complete orthogonal set over [0, π]. Similarly, the set of functions sin nx,
n = 1, 2, 3, ..., forms a complete orthogonal set over this same interval. Unless
forced by boundary conditions or a symmetry restriction, the choice of which
set to use is arbitrary.
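The different behavior of the two Fourier expansions outside [0, π] can be seen by evaluating truncated series. This is a sketch only: the cosine coefficients used here are those obtained from Eq. 14.11 for the even extension of f(x) = x (the triangular wave of Exercise 14.3.4), the sine series is Eq. 14.15, and the number of terms is an arbitrary choice.

```python
import math

def cosine_series(x, terms=2000):
    """Cosine series of f(x) = x on [0, pi]: pi/2 - (4/pi) * sum over odd n of cos(n x)/n**2."""
    odd = range(1, 2 * terms, 2)
    return math.pi / 2 - (4.0 / math.pi) * sum(math.cos(n * x) / n ** 2 for n in odd)

def sine_series(x, terms=2000):
    """Sine series of f(x) = x on [0, pi] (Eq. 14.15)."""
    return 2.0 * sum((-1) ** (n + 1) * math.sin(n * x) / n for n in range(1, terms + 1))
```

Inside the basic interval, for example at x = 1, both sums are close to 1. At x = -1 the cosine series is close to +1 and the sine series close to -1, in line with Eqs. 14.25 and 14.26.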

In addition to the advantages of representing discontinuous and periodic 
functions, there is a third very real advantage in using a Fourier series. Suppose 
that we are solving the equation of motion of an oscillating particle subject to a 
periodic driving force. The Fourier expansion of the driving force then gives us 
the fundamental term and a series of harmonics. The (linear) differential equa- 
tion may be solved for each of these harmonics individually, a process that may 
be much easier than dealing with the original driving force. Then, as long as the 
differential equation is linear, all the solutions may be added together to obtain 
the final solution.² This is more than just a clever mathematical trick. It corresponds
to finding the response of the system to the fundamental frequency and
to each of the harmonic frequencies. 
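The harmonic-by-harmonic procedure can be sketched for a damped, driven linear oscillator. Everything specific here is an assumption made for illustration: the equation m x'' + damping x' + k x = F(t), the parameter names, and the choice of a square-wave force with constant term h/2 and sine coefficients 2h/(nπ) for odd n. Only the steady-state response is computed.

```python
import cmath
import math

def square_wave_harmonics(h, n_max):
    """Sine coefficients 2h/(n*pi) of a square-wave force, odd n only."""
    return {n: 2.0 * h / (n * math.pi) for n in range(1, n_max + 1, 2)}

def steady_state(t, m, damping, k, omega, h, n_max):
    """Return (x, x', x'') for m x'' + damping x' + k x = F(t), one harmonic at a time."""
    x = (0.5 * h) / k  # response to the constant term h/2 of the force
    v = 0.0
    acc = 0.0
    for n, b_n in square_wave_harmonics(h, n_max).items():
        w = n * omega
        # complex amplitude of the response to the drive b_n * exp(i w t)
        amplitude = b_n / (k - m * w * w + 1j * damping * w)
        phasor = amplitude * cmath.exp(1j * w * t)
        x += phasor.imag                  # imaginary part matches a sin(w t) drive
        v += (1j * w * phasor).imag
        acc += (-(w * w) * phasor).imag
    return x, v, acc

def truncated_force(t, omega, h, n_max):
    """The truncated Fourier series used as the driving force."""
    return 0.5 * h + sum(b_n * math.sin(n * omega * t)
                         for n, b_n in square_wave_harmonics(h, n_max).items())
```

Because the equation is linear, substituting the summed response back into the left-hand side reproduces the truncated force term by term.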

One question that is sometimes raised is, “Were the harmonics there all along 
or were they created by our Fourier analysis?” One answer compares the func- 
tional resolution into harmonics with the resolution of a vector into rectangular 
components. The components may have been present in the sense that they may 
be isolated and observed, but the resolution is certainly not unique. Hence many 
authorities prefer to say that the harmonics were created by our choice of expan- 
sion. Other expansions in other sets of orthogonal functions would give different 
results. For further discussion the reader should consult a series of notes and
letters in the American Journal of Physics.³


² One of the nastier features of nonlinear differential equations is that this
principle of superposition is not valid.

³ B. L. Robinson, “Concerning frequencies resulting from distortion,” Am. J.
Phys. 21, 391 (1953).
F. W. Van Name, Jr., “Concerning frequencies resulting from distortion,” Am. J.
Phys. 22, 94 (1954).

Change of Interval

So far attention has been restricted to an interval of length 2π. This restriction
may easily be relaxed. If f(x) is periodic with a period 2L, we may write

$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left[ a_n \cos\frac{n\pi x}{L} + b_n \sin\frac{n\pi x}{L} \right], \tag{14.27}$$

with

$$a_n = \frac{1}{L} \int_{-L}^{L} f(t) \cos\frac{n\pi t}{L} \, dt, \quad n = 0, 1, 2, 3, \ldots, \tag{14.28}$$

$$b_n = \frac{1}{L} \int_{-L}^{L} f(t) \sin\frac{n\pi t}{L} \, dt, \quad n = 1, 2, 3, \ldots, \tag{14.29}$$

replacing x in Eq. 14.1 with πx/L and t in Eqs. 14.11 and 14.12 with πt/L. (For
convenience the interval in Eqs. 14.11 and 14.12 is shifted to -π ≤ t ≤ π.) The
choice of the symmetric interval (-L, L) is not essential. For f(x) periodic with a
period of 2L, any interval $(x_0, x_0 + 2L)$ will do. The choice is a matter of convenience
or literally personal preference.
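The change of variable can be written out for the cosine coefficient; the sine coefficient follows in the same way. In this sketch the auxiliary variable $\xi$ and the function $g$ are introduced only for the derivation and do not appear in the text.

```latex
% Assumption: \xi = \pi x / L and g are auxiliary symbols used only in this sketch.
\begin{align*}
g(\xi) &\equiv f\!\left(\frac{L\xi}{\pi}\right), \qquad g(\xi + 2\pi) = g(\xi), \\
a_n &= \frac{1}{\pi}\int_{-\pi}^{\pi} g(\xi)\cos n\xi \, d\xi
     = \frac{1}{\pi}\int_{-L}^{L} f(t)\cos\frac{n\pi t}{L}\,\frac{\pi}{L}\,dt
     = \frac{1}{L}\int_{-L}^{L} f(t)\cos\frac{n\pi t}{L}\,dt .
\end{align*}
```

The first line uses the period 2L of f(x); the second applies Eq. 14.11 on the shifted interval to g and then substitutes $\xi = \pi t / L$.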


EXERCISES 


14.2.1 The boundary conditions (such as ψ(0) = ψ(l) = 0) may suggest solutions of the
form sin(nπx/l) and eliminate the corresponding cosines.

(a) Verify that the boundary conditions used in the Sturm-Liouville theory
are satisfied for the interval (0, l). Note that this is only half the usual Fourier
interval.

(b) Show that the set of functions $\varphi_n(x) = \sin(n\pi x/l)$, n = 1, 2, 3, ... satisfies
an orthogonality relation

$$\int_0^l \varphi_m(x) \varphi_n(x) \, dx = \frac{l}{2} \delta_{mn}, \quad n > 0.$$


14.2.2 (a) Expand f(x) = x in the interval (0, 2L). Sketch the series you have found
(right-hand side of Ans.) over (-2L, 2L).

ANS. $$x = L - \frac{2L}{\pi} \sum_{n=1}^{\infty} \frac{1}{n} \sin\frac{n\pi x}{L}.$$

(b) Expand f(x) = x as a sine series in the half interval (0, L). Sketch the series
you have found (right-hand side of Ans.) over (-2L, 2L).

ANS. $$x = \frac{2L}{\pi} \sum_{n=1}^{\infty} (-1)^{n+1} \frac{1}{n} \sin\frac{n\pi x}{L}.$$

14.2.3 In some problems it is convenient to approximate sin πx over the interval [0, 1]
by a parabola ax(1 - x), where a is a constant. To get a feeling for the accuracy
of this approximation, expand 4x(1 - x) in a Fourier sine series:

$$f(x) = \left. \begin{cases} 4x(1 - x), & 0 \le x \le 1 \\ 4x(1 + x), & -1 \le x \le 0 \end{cases} \right\} = \sum_{n=1}^{\infty} b_n \sin n\pi x$$

(Fig. 14.4).

ANS. $$b_n = \frac{32}{\pi^3 n^3}, \quad n \text{ odd}; \qquad b_n = 0, \quad n \text{ even}.$$

FIG. 14.4

FIG. 14.5 Square wave

14.3 APPLICATIONS OF FOURIER SERIES 


EXAMPLE 14.3.1 Square Wave — High Frequencies 


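One simple application of Fourier series, the analysis of a "square" wave
(Fig. 14.5) in terms of its Fourier components, may occur in electronic circuits
designed to handle sharply rising pulses. Suppose that our wave is defined by

$$f(x) = 0, \quad -\pi < x < 0, \qquad f(x) = h, \quad 0 < x < \pi. \tag{14.30}$$

From Eqs. 14.11 and 14.12 we find

$$a_0 = \frac{1}{\pi} \int_0^{\pi} h \, dt = h, \tag{14.31}$$

$$a_n = \frac{1}{\pi} \int_0^{\pi} h \cos nt \, dt = 0, \quad n = 1, 2, 3, \ldots, \tag{14.32}$$

$$b_n = \frac{1}{\pi} \int_0^{\pi} h \sin nt \, dt = \frac{h}{n\pi} (1 - \cos n\pi); \tag{14.33}$$

$$b_n = \frac{2h}{n\pi}, \quad n \text{ odd}, \tag{14.34}$$

$$b_n = 0, \quad n \text{ even}. \tag{14.35}$$

The resulting series is

$$f(x) = \frac{h}{2} + \frac{2h}{\pi} \left( \frac{\sin x}{1} + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots \right). \tag{14.36}$$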


Except for the first term which represents an average of f(x) over the interval
[-π, π], all the cosine terms have vanished. Since f(x) - h/2 is odd, we have a
Fourier sine series. Although only the odd terms in the sine series occur, they
fall only as $n^{-1}$. This is similar to the convergence (or lack of convergence)
of the harmonic series. Physically this means that our square wave contains a
lot of high-frequency components. If the electronic apparatus will not pass these
components, our square wave input will emerge more or less rounded off,
perhaps as an amorphous blob.

FIG. 14.6 Full wave rectifier

EXAMPLE 14.3.2 Full Wave Rectifier 


As a second example, let us ask how well the output of a full wave rectifier
approaches pure direct current (Fig. 14.6). Our rectifier may be thought of as
having passed the positive peaks of an incoming sine wave and inverting the
negative peaks. This yields

$$f(t) = \sin\omega t, \quad 0 < \omega t < \pi, \qquad f(t) = -\sin\omega t, \quad -\pi < \omega t < 0. \tag{14.37}$$

Since f(t) defined here is even, no terms of the form sin nωt will appear.
Again, from Eqs. 14.11 and 14.12, we have

$$a_0 = -\frac{1}{\pi} \int_{-\pi}^{0} \sin\omega t \, d(\omega t) + \frac{1}{\pi} \int_0^{\pi} \sin\omega t \, d(\omega t) = \frac{2}{\pi} \int_0^{\pi} \sin\omega t \, d(\omega t) = \frac{4}{\pi}, \tag{14.38}$$

$$a_n = \frac{2}{\pi} \int_0^{\pi} \sin\omega t \cos n\omega t \, d(\omega t) = \begin{cases} -\dfrac{2}{\pi} \dfrac{2}{n^2 - 1}, & n \text{ even}, \\ 0, & n \text{ odd}. \end{cases} \tag{14.39}$$

Note carefully that [0, π] is not an orthogonality interval for both sines and
cosines together and we do not get zero for even n. The resulting series is

$$f(t) = \frac{2}{\pi} - \frac{4}{\pi} \sum_{n=2,4,6,\ldots} \frac{\cos n\omega t}{n^2 - 1}. \tag{14.40}$$

The original frequency ω has been eliminated. The lowest frequency oscillation
is 2ω. The high-frequency components fall off as $n^{-2}$, showing that the full
wave rectifier does a fairly good job of approximating direct current. Whether
this good approximation is adequate depends on the particular application.
If the remaining ac components are objectionable, they may be further suppressed
by appropriate filter circuits.

These two examples bring out two features characteristic of Fourier
expansions.¹

1. If f(x) has discontinuities (as in the square wave in
Example 14.3.1), we can expect the nth coefficient to
be decreasing as 1/n. Convergence is relatively slow.²

2. If f(x) is continuous (although possibly with discontinuous
derivatives as in the full wave rectifier of
Example 14.3.2), we can expect the nth coefficient to
be decreasing as $1/n^2$.
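The two decay rates can be checked by integrating the coefficient formulas numerically for the square wave and the full wave rectifier. This is a minimal sketch: the midpoint rule, the helper names, the choice h = 1, and ω = 1 are assumptions made for illustration.

```python
import math

def midpoint_integral(g, lower, upper, cells=4000):
    """Midpoint-rule approximation of the integral of g over [lower, upper]."""
    h = (upper - lower) / cells
    return h * sum(g(lower + (k + 0.5) * h) for k in range(cells))

def sine_coefficient(f, n):
    return midpoint_integral(lambda t: f(t) * math.sin(n * t), -math.pi, math.pi) / math.pi

def cosine_coefficient(f, n):
    return midpoint_integral(lambda t: f(t) * math.cos(n * t), -math.pi, math.pi) / math.pi

def square_wave(t, h=1.0):
    """Square wave of Example 14.3.1 with height h."""
    return h if t > 0.0 else 0.0

def full_wave_rectifier(t):
    """Rectified sine wave of Example 14.3.2 with omega = 1."""
    return abs(math.sin(t))

# n * b_n for the discontinuous square wave (odd n).
scaled_square = [n * sine_coefficient(square_wave, n) for n in (1, 3, 5, 7, 9)]
# (n**2 - 1) * a_n for the continuous rectifier output (even n).
scaled_rectifier = [(n * n - 1) * cosine_coefficient(full_wave_rectifier, n)
                    for n in (2, 4, 6, 8, 10)]
```

Each scaled list should be nearly constant, close to 2/π for the square wave and -4/π for the rectifier, which is the 1/n versus $1/n^2$ behavior described above.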

EXAMPLE 14.3.3 Infinite Series, Riemann Zeta Function 


As a final example, we consider the purely mathematical problem of expanding
$x^2$. Let

$$f(x) = x^2, \quad -\pi < x < \pi. \tag{14.41}$$

By symmetry all $b_n = 0$. For the $a_n$'s we have

$$a_0 = \frac{1}{\pi} \int_{-\pi}^{\pi} x^2 \, dx = \frac{2\pi^2}{3}, \tag{14.42}$$

$$a_n = \frac{2}{\pi} \int_0^{\pi} x^2 \cos nx \, dx = \frac{2}{\pi} (-1)^n \frac{2\pi}{n^2} = (-1)^n \frac{4}{n^2}. \tag{14.43}$$

From this we obtain

$$x^2 = \frac{\pi^2}{3} + 4 \sum_{n=1}^{\infty} (-1)^n \frac{\cos nx}{n^2}. \tag{14.44}$$

As it stands, Eq. 14.44 is of no particular importance, but if we set x = π,

$$\cos n\pi = (-1)^n \tag{14.45}$$

and Eq. 14.44 becomes³

$$\pi^2 = \frac{\pi^2}{3} + 4 \sum_{n=1}^{\infty} \frac{1}{n^2} \tag{14.46}$$

or

$$\frac{\pi^2}{6} = \sum_{n=1}^{\infty} \frac{1}{n^2} \equiv \zeta(2), \tag{14.47}$$

thus yielding the Riemann zeta function, ζ(2), in closed form (in agreement with
the Bernoulli number result of Section 5.9). From our expansion of $x^2$ and
expansions of other powers of x numerous other infinite series can be evaluated.
A few are included in the subsequent list of exercises.

¹ G. Raisbeck, "Order of Magnitude of Fourier Coefficients," Am. Math.
Monthly 62, 149-155 (1955).

² A technique for improving the rate of convergence is developed in the
exercises of Section 14.4.

³ Note that the point x = π is not a point of discontinuity.
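Both the expansion of $x^2$ (Eq. 14.44) and the resulting value of ζ(2) (Eq. 14.47) are easy to spot-check with partial sums. The sketch below is illustrative only; the function names and the numbers of terms are arbitrary choices.

```python
import math

def x_squared_series(x, terms):
    """Partial sum of Eq. 14.44: pi**2/3 + 4 * sum (-1)**n cos(n x) / n**2."""
    return math.pi ** 2 / 3 + 4.0 * sum((-1) ** n * math.cos(n * x) / n ** 2
                                        for n in range(1, terms + 1))

def zeta2_partial(terms):
    """Partial sum of sum 1/n**2, the series in Eq. 14.47."""
    return sum(1.0 / n ** 2 for n in range(1, terms + 1))
```

The remainder of the ζ(2) series after N terms is roughly 1/N, so the partial sums approach $\pi^2/6$ slowly; this is consistent with coefficients that fall off only as $1/n^2$.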


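Fourier Series

1. $$\sum_{n=1}^{\infty} \frac{1}{n} \sin nx = \begin{cases} -\tfrac12(\pi + x), & -\pi \le x < 0 \\ \tfrac12(\pi - x), & 0 < x \le \pi \end{cases}$$
Reference: Exercise 14.1.5, Exercise 14.3.3

2. $$\sum_{n=1}^{\infty} (-1)^{n+1} \frac{1}{n} \sin nx = \tfrac12 x, \quad -\pi < x < \pi$$
Reference: Exercise 14.1.6, Exercise 14.3.2

3. $$\sum_{n=0}^{\infty} \frac{1}{2n + 1} \sin(2n + 1)x = \begin{cases} -\pi/4, & -\pi < x < 0 \\ +\pi/4, & 0 < x < \pi \end{cases}$$
Reference: Exercise 14.1.7, Eq. 14.36

4. $$\sum_{n=1}^{\infty} \frac{1}{n} \cos nx = -\ln\left[ 2 \sin\frac{|x|}{2} \right], \quad -\pi < x < \pi$$
Reference: Eq. 14.20, Exercise 14.3.15

5. $$\sum_{n=1}^{\infty} (-1)^n \frac{1}{n} \cos nx = -\ln\left[ 2 \cos\frac{x}{2} \right], \quad -\pi < x < \pi$$
Reference: Exercise 14.3.15

6. $$\sum_{n=0}^{\infty} \frac{1}{2n + 1} \cos(2n + 1)x = \frac12 \ln\left[ \cot\frac{|x|}{2} \right], \quad -\pi < x < \pi$$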


Complex Variables — Abel's Theorem 

Consider a function f(z) represented by a convergent power series

$$f(z) = \sum_{n=0}^{\infty} c_n z^n = \sum_{n=0}^{\infty} c_n r^n e^{in\theta}. \tag{14.48}$$

This is our Fourier exponential series, Eq. 14.2. Separating real and imaginary
parts

$$u(r, \theta) = \sum_{n=0}^{\infty} c_n r^n \cos n\theta, \qquad v(r, \theta) = \sum_{n=1}^{\infty} c_n r^n \sin n\theta, \tag{14.49}$$

the Fourier cosine and sine series. Abel’s theorem asserts that if u(1, θ) and
v(1, θ) are convergent for a given θ, then

$$u(1, \theta) + iv(1, \theta) = \lim_{r \to 1} f(re^{i\theta}). \tag{14.50}$$

An application of this appears as Exercise 14.3.15.

EXERCISES


14.3.1 Develop the Fourier series representation of

$$f(t) = \begin{cases} 0, & -\pi \le \omega t \le 0, \\ \sin\omega t, & 0 \le \omega t \le \pi. \end{cases}$$

This is the output of a simple half-wave rectifier. It is also an approximation
of the solar thermal effect that produces "tides" in the atmosphere.

ANS. $$f(t) = \frac{1}{\pi} + \frac12 \sin\omega t - \frac{2}{\pi} \sum_{n=2,4,6,\ldots} \frac{\cos n\omega t}{n^2 - 1}.$$

14.3.2 A sawtooth wave is given by

$$f(x) = x, \quad -\pi < x < \pi.$$

Show that

$$f(x) = 2 \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n} \sin nx.$$

14.3.3 A different sawtooth wave is described by

$$f(x) = \begin{cases} -\tfrac12(\pi + x), & -\pi \le x < 0 \\ +\tfrac12(\pi - x), & 0 < x \le \pi. \end{cases}$$

Show that $f(x) = \sum_{n=1}^{\infty} (\sin nx / n)$.

14.3.4 A triangular wave (Fig. 14.7) is represented by

$$f(x) = \begin{cases} x, & 0 < x < \pi \\ -x, & -\pi < x < 0. \end{cases}$$

Represent f(x) by a Fourier series.

ANS. $$f(x) = \frac{\pi}{2} - \frac{4}{\pi} \sum_{n=1,3,5,\ldots} \frac{\cos nx}{n^2}.$$

FIG. 14.7 Triangular wave
14.3.5 Expand

$$f(x) = \begin{cases} 1, & x^2 < x_0^2, \\ 0, & x^2 > x_0^2 \end{cases}$$

in the interval [-π, π].

Note. This variable width square wave is of some importance in electronic
music.

FIG. 14.8

14.3.6 A metal cylindrical tube of radius a is split lengthwise into two nontouching
halves. The top half is maintained at a potential +V, the bottom half at a
potential -V (Fig. 14.8). Separate the variables in Laplace’s equation and solve
for the electrostatic potential for r < a. Observe the resemblance between your
solution for r = a and the Fourier series for a square wave.

14.3.7 A metal cylinder is placed in a (previously) uniform electric field, $E_0$, the axis
of the cylinder perpendicular to that of the original field.

(a) Find the perturbed electrostatic potential.

(b) Find the induced surface charge on the cylinder as a function of angular
position.

14.3.8 Transform the Fourier expansion of a square wave, Eq. 14.36, into a power
series. Show that the coefficients of $x^1$ form a divergent series. Repeat for the
coefficients of $x^3$.

A power series cannot handle a discontinuity. These infinite coefficients are 
the result of attempting to beat this basic limitation on power series. 


14.3.9 (a) Show that the Fourier expansion of cos ax is

$$\cos ax = \frac{2a \sin a\pi}{\pi} \left\{ \frac{1}{2a^2} - \frac{\cos x}{a^2 - 1^2} + \frac{\cos 2x}{a^2 - 2^2} - \cdots \right\},$$

$$a_n = (-1)^n \frac{2a \sin a\pi}{\pi(a^2 - n^2)}.$$

(b) From the preceding result show that

$$a\pi \cot a\pi = 1 - 2 \sum_{p=1}^{\infty} \zeta(2p) a^{2p}.$$

This provides an alternate derivation of the relation between the Riemann
zeta function and the Bernoulli numbers, Eq. 5.151.


14.3.10 Derive the Fourier series expansion of the Dirac delta function δ(x) in the
interval -π < x < π.

(a) What significance can be attached to the constant term?

(b) In what region is this representation valid?

(c) With the identity

$$\sum_{n=1}^{N} \cos nx = \frac{\sin(Nx/2)}{\sin(x/2)} \cos\left[ (N + 1) \frac{x}{2} \right],$$

show that your Fourier representation of δ(x) is consistent with Eq. 8.83d.

14.3.11 Expand δ(x - t) in a Fourier series. Compare your result with the bilinear
form of Eq. 9.83.

ANS. $$\delta(x - t) = \frac{1}{2\pi} + \frac{1}{\pi} \sum_{n=1}^{\infty} (\cos nx \cos nt + \sin nx \sin nt) = \frac{1}{2\pi} + \frac{1}{\pi} \sum_{n=1}^{\infty} \cos n(x - t).$$


14.3.12 Verify that

$$\delta(\varphi_1 - \varphi_2) = \frac{1}{2\pi} \sum_{m=-\infty}^{\infty} e^{im(\varphi_1 - \varphi_2)}$$

is a Dirac delta function by showing that it satisfies the definition of a Dirac
delta function:

$$\int_{-\pi}^{\pi} f(\varphi_1) \frac{1}{2\pi} \sum_{m=-\infty}^{\infty} e^{im(\varphi_1 - \varphi_2)} \, d\varphi_1 = f(\varphi_2).$$

Hint. Represent $f(\varphi_1)$ by an exponential Fourier series.

Note. The continuum analog of this expression is developed in Section 15.2.
The most important application of this expression is in the determination of
Green’s functions, Section 16.6.


14.3.13 (a) Using

$$f(x) = x^2, \quad -\pi < x < \pi,$$

show that

$$\sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^2} = \frac{\pi^2}{12} = \eta(2).$$

(b) Using the Fourier series for a triangular wave developed in Exercise 14.3.4,
show that

$$\sum_{n=1}^{\infty} \frac{1}{(2n - 1)^2} = \frac{\pi^2}{8} = \lambda(2).$$

(c) Using

$$f(x) = x^4, \quad -\pi < x < \pi,$$

show that

$$\sum_{n=1}^{\infty} \frac{1}{n^4} = \frac{\pi^4}{90} = \zeta(4), \qquad \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n^4} = \frac{7\pi^4}{720} = \eta(4).$$

(d) Using

$$f(x) = \begin{cases} x(\pi - x), & 0 < x < \pi, \\ x(\pi + x), & -\pi < x < 0, \end{cases}$$

derive

$$f(x) = \frac{8}{\pi} \sum_{n=1,3,5,\ldots} \frac{\sin nx}{n^3}$$

and show that

$$\sum_{n=1,3,5,\ldots} (-1)^{(n-1)/2} n^{-3} = 1 - \frac{1}{3^3} + \frac{1}{5^3} - \frac{1}{7^3} + \cdots = \frac{\pi^3}{32} = \beta(3).$$

(e) Using the Fourier series for a square wave, show that

$$\sum_{n=1,3,5,\ldots} (-1)^{(n-1)/2} n^{-1} = 1 - \frac13 + \frac15 - \frac17 + \cdots = \frac{\pi}{4} = \beta(1).$$

This is Leibnitz’s formula for π, obtained by a different technique in Exercise
5.7.6.

Note. The η(2), η(4), λ(2), β(1), and β(3) functions are defined by the indicated
series. General definitions appear in Section 5.9.


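14.3.14 (a) Find the Fourier series representation of

$$f(x) = \begin{cases} 0, & -\pi < x \le 0,\\ x, & 0 \le x < \pi.\end{cases}$$

(b) From your Fourier expansion show that

$$\frac{\pi^2}{8} = 1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots.$$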


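14.3.15 Let $f(z) = \ln(1 + z) = \sum_{n=1}^{\infty}(-1)^{n+1}z^n/n$. (This series converges to $\ln(1 + z)$ for
$|z| \le 1$, except at the point $z = -1$.)

(a) From the real parts show that

$$\ln\left(2\cos\frac{\theta}{2}\right) = \sum_{n=1}^{\infty}(-1)^{n+1}\frac{\cos n\theta}{n},\qquad -\pi < \theta < \pi.$$

(b) Using a change of variable, transform part (a) into

$$-\ln\left(2\sin\frac{\varphi}{2}\right) = \sum_{n=1}^{\infty}\frac{\cos n\varphi}{n},\qquad 0 < \varphi < 2\pi.$$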


14.3.16 A symmetric triangular pulse of adjustable height and width is described by

$$f(x) = \begin{cases} a(1 - |x|/b), & 0 \le |x| \le b,\\ 0, & b \le |x| \le \pi.\end{cases}$$

(a) Show that the Fourier coefficients are

$$a_0 = \frac{ab}{\pi},\qquad a_n = \frac{2ab}{\pi}\,\frac{1 - \cos nb}{(nb)^2}.$$

Sum the finite Fourier series through n = 10 and through n = 100 for
$x/\pi = 0(1/9)1$. Take a = 1 and $b = \pi/2$.

(b) Call a Fourier analysis subroutine (if available) to calculate the Fourier
coefficients of $f(x)$, $a_0$ through $a_{10}$.


14.3.17 (a) Using a Fourier analysis subroutine, calculate the Fourier cosine coefficients
$a_0$ through $a_{10}$ of

$$f(x) = [1 - (x/\pi)^2]^{1/2},\qquad [-\pi, \pi].$$

(b) Spot check by calculating some of the preceding coefficients by direct
numerical quadrature.

Check values. $a_0 = 0.785$, $a_2 = 0.284$.

14.3.18 Using a Fourier analysis subroutine, calculate the Fourier coefficients through
$a_{10}$ and $b_{10}$ for

(a) a full-wave rectifier, Example 14.3.2,

(b) a half-wave rectifier, Exercise 14.3.1. Check your results against the analytic
forms given (Eq. 14.39 and Exercise 14.3.1).


14.4 PROPERTIES OF FOURIER SERIES 


Convergence 

It might be noted, first, that our Fourier series should not be expected to be
uniformly convergent if it represents a discontinuous function. A uniformly
convergent series of continuous functions ($\sin nx$, $\cos nx$) always yields a
continuous function (compare Section 5.5). If, however, (a) $f(x)$ is continuous,
$-\pi \le x \le \pi$, (b) $f(-\pi) = f(+\pi)$, and (c) $f'(x)$ is sectionally continuous, the
Fourier series for $f(x)$ will converge uniformly. These restrictions do not demand
that $f(x)$ be periodic, but they will be satisfied by continuous, differentiable,
periodic functions (period of $2\pi$). For a proof of uniform convergence the
reader is referred to the literature.¹ With or without a discontinuity in $f(x)$,
the Fourier series will yield convergence in the mean (Section 9.4).


Integration 

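Term-by-term integration of the series

$$f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty}a_n\cos nx + \sum_{n=1}^{\infty}b_n\sin nx \tag{14.51}$$

yields

$$\int_{x_0}^{x} f(x)\,dx = \frac{a_0 x}{2}\bigg|_{x_0}^{x} + \sum_{n=1}^{\infty}\frac{a_n}{n}\sin nx\,\bigg|_{x_0}^{x} - \sum_{n=1}^{\infty}\frac{b_n}{n}\cos nx\,\bigg|_{x_0}^{x}. \tag{14.52}$$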


Clearly, the effect of integration is to place an additional power of n in the 
denominator of each coefficient. This results in more rapid convergence than 
before. Consequently, a convergent Fourier series may always be integrated 
term by term, the resulting series converging uniformly to the integral of the 
original function. Indeed, term-by-term integration may be valid even if the
original series (Eq. 14.51) is not itself convergent! The function $f(x)$ need only
be integrable. A discussion will be found in Jeffreys and Jeffreys, Section 14.06.

¹ See, for instance, R. V. Churchill, Fourier Series and Boundary Value Problems.
New York: McGraw-Hill (1941), Section 38.

Strictly speaking, Eq. 14.52 may not be a Fourier series; that is, if $a_0 \neq 0$,
there will be a term $\tfrac{1}{2}a_0 x$. However,

$$\int_{x_0}^{x} f(x)\,dx - \tfrac{1}{2}a_0 x \tag{14.53}$$

will still be a Fourier series.
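The gain in convergence from integration can be seen numerically. The sketch below is an illustration only: it takes the sawtooth series for $f(x) = x$ (Eq. 14.55 of this section), integrates it term by term from 0 to x to get $x^2/2 = 2\sum (-1)^{n+1}(1 - \cos nx)/n^2$, and compares partial sums of the two series. The function names and the choice of x = 2 with 200 terms are assumptions made for the example.

```python
import math

def original_series(x, r):
    """Partial sum of 2 * sum_{n=1}^{r} (-1)^(n+1) sin(n x)/n, which represents x."""
    return 2.0 * sum((-1) ** (n + 1) * math.sin(n * x) / n for n in range(1, r + 1))

def integrated_series(x, r):
    """Term-by-term integral from 0 to x of the series above; represents x**2 / 2."""
    return 2.0 * sum((-1) ** (n + 1) * (1.0 - math.cos(n * x)) / n ** 2
                     for n in range(1, r + 1))

x, r = 2.0, 200
error_original = abs(original_series(x, r) - x)
error_integrated = abs(integrated_series(x, r) - x ** 2 / 2)
print(error_original, error_integrated)
```

With the extra power of n in each denominator, the integrated partial sum should sit much closer to its limit than the original one does for the same number of terms.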


Differentiation 

The situation regarding differentiation is quite different from that of integration.
Here the word is caution. Consider the series for

$$f(x) = x,\qquad -\pi < x < \pi. \tag{14.54}$$

We readily find (compare Exercise 14.3.2) that the Fourier series is

$$x = 2\sum_{n=1}^{\infty}(-1)^{n+1}\frac{\sin nx}{n},\qquad -\pi < x < \pi. \tag{14.55}$$

Differentiating term by term, we obtain

$$1 = 2\sum_{n=1}^{\infty}(-1)^{n+1}\cos nx, \tag{14.56}$$

which is not convergent! Warning. Check your derivative.

For a triangular wave (Exercise 14.3.4), in which the convergence is more
rapid (and uniform),

$$f(x) = \frac{\pi}{2} - \frac{4}{\pi}\sum_{n=1,\,\mathrm{odd}}^{\infty}\frac{\cos nx}{n^2}. \tag{14.57}$$

Differentiating term by term,

$$f'(x) = \frac{4}{\pi}\sum_{n=1,\,\mathrm{odd}}^{\infty}\frac{\sin nx}{n}, \tag{14.58}$$

which is the Fourier expansion of a square wave

$$f'(x) = \begin{cases} 1, & 0 < x < \pi,\\ -1, & -\pi < x < 0.\end{cases} \tag{14.59}$$


Inspection of Fig. 14.7 verifies that this is indeed the derivative of our triangular 
wave. 

As the inverse of integration, the operation of differentiation has placed an 
additional factor n in the numerator of each term. This reduces the rate of 
convergence and may, as in the first case mentioned, render the differentiated 
series divergent. 

In general, term-by-term differentiation is permissible under the same condi- 
tions listed for uniform convergence. 
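The contrast between the two differentiated series can be checked numerically. The following is a minimal sketch, not part of the original development: it evaluates partial sums of Eq. 14.56 (the differentiated sawtooth) and of Eq. 14.58 (the differentiated triangular wave). The function names and sample points are chosen only for illustration.

```python
import math

def sawtooth_derivative_partial(x, r):
    """Partial sum of Eq. 14.56: 2 * sum_{n=1}^{r} (-1)^(n+1) cos(n x)."""
    return 2.0 * sum((-1) ** (n + 1) * math.cos(n * x) for n in range(1, r + 1))

def triangle_derivative_partial(x, r):
    """Partial sum of Eq. 14.58: (4/pi) * sum over odd n <= r of sin(n x)/n."""
    return (4.0 / math.pi) * sum(math.sin(n * x) / n for n in range(1, r + 1, 2))

# At x = 0 the sawtooth partial sums jump between 0 and 2 and never settle.
print([sawtooth_derivative_partial(0.0, r) for r in (10, 11, 1000, 1001)])

# At x = 1 the triangular-wave derivative settles toward the square-wave value 1.
print([triangle_derivative_partial(1.0, r) for r in (9, 99, 999, 9999)])
```

The first list alternates with the parity of r, which is the divergence flagged in the warning above; the second approaches 1, the value of the square wave of Eq. 14.59 for $0 < x < \pi$.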


EXERCISES 

14.4.1 Show that integration of the Fourier expansion of $f(x) = x$, $-\pi < x < \pi$, leads
to

$$\frac{\pi^2}{12} = \sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n^2} = 1 - \frac{1}{4} + \frac{1}{9} - \frac{1}{16} + \cdots.$$

14.4.2 Parseval's identity.

(a) Assuming that the Fourier expansion of $f(x)$ is uniformly convergent, show
that

$$\frac{1}{\pi}\int_{-\pi}^{\pi}[f(x)]^2\,dx = \frac{a_0^2}{2} + \sum_{n=1}^{\infty}(a_n^2 + b_n^2).$$

This is Parseval's identity. It is actually a special case of the completeness
relation, Eq. 9.72.

(b) Given

$$x^2 = \frac{\pi^2}{3} + 4\sum_{n=1}^{\infty}\frac{(-1)^n\cos nx}{n^2},\qquad -\pi \le x \le \pi,$$

apply Parseval's identity to obtain $\zeta(4)$ in closed form.

(c) The condition of uniform convergence is not necessary. Show this by
applying the Parseval identity to the square wave

$$f(x) = \begin{cases} -1, & -\pi < x < 0\\ \phantom{-}1, & 0 < x < \pi\end{cases}\; = \frac{4}{\pi}\sum_{n=1}^{\infty}\frac{\sin(2n-1)x}{2n-1}.$$

14.4.3 Show that integrating the Fourier expansion of the Dirac delta function
(Exercise 14.3.10) leads to the Fourier representation of the square wave, Eq.
14.3.6, with h = 1.

Note. Integrating the constant term $(1/2\pi)$ leads to a term $x/2\pi$. What are you
going to do with this?


14.4.3A Integrate the Fourier expansion of the unit step function

$$f(x) = \begin{cases} 0, & -\pi < x < 0,\\ 1, & 0 < x < \pi.\end{cases}$$

Show that your integrated series agrees with Exercise 14.3.14.


14.4.4 In the interval $(-\pi, \pi)$,

$$\delta_n(x) = \begin{cases} n, & \text{for } |x| < \dfrac{1}{2n},\\[4pt] 0, & \text{for } |x| > \dfrac{1}{2n}\end{cases}$$

(Fig. 14.9).

(a) Expand $\delta_n(x)$ as a Fourier cosine series.

(b) Show that your Fourier series agrees with a Fourier expansion of $\delta(x)$ in
the limit as $n \to \infty$.


14.4.5 Confirm the delta function nature of your Fourier series of Exercise 14.4.4 by
showing that for any $f(x)$ that is finite in the interval $[-\pi, \pi]$ and continuous
at x = 0,

$$\int_{-\pi}^{\pi} f(x)\,[\text{Fourier expansion of } \delta_\infty(x)]\,dx = f(0).$$

14.4.6 (a) Show that the Dirac delta function $\delta(x - a)$, expanded in a Fourier sine
series in the half interval (0, L), (0 < a < L), is given by

$$\delta(x - a) = \frac{2}{L}\sum_{n=1}^{\infty}\sin\left(\frac{n\pi a}{L}\right)\sin\left(\frac{n\pi x}{L}\right).$$

Note that this series actually describes
$-\delta(x + a) + \delta(x - a)$ in the interval $(-L, L)$.

(b) By integrating both sides of the preceding equation from 0 to x, show that
the cosine expansion of the square wave

$$f(x) = \begin{cases} 0, & 0 \le x < a\\ 1, & a < x < L,\end{cases}$$

is

$$f(x) = \frac{2}{\pi}\sum_{n=1}^{\infty}\frac{1}{n}\sin\left(\frac{n\pi a}{L}\right) - \frac{2}{\pi}\sum_{n=1}^{\infty}\frac{1}{n}\sin\left(\frac{n\pi a}{L}\right)\cos\left(\frac{n\pi x}{L}\right),\qquad 0 \le x < L.$$

(c) Verify that the term

$$\frac{2}{\pi}\sum_{n=1}^{\infty}\frac{1}{n}\sin\left(\frac{n\pi a}{L}\right)\quad\text{is}\quad\langle f(x)\rangle.$$

14.4.7 Verify the Fourier cosine expansion of the square wave, Exercise 14.4.6(b), by
direct calculation of the Fourier coefficients.


14.4.8 (a) A string is clamped at both ends x = 0 and x = L. Assuming small amplitude
vibrations, we find that the amplitude y(x, t) satisfies the wave equation

$$\frac{\partial^2 y}{\partial x^2} = \frac{1}{v^2}\frac{\partial^2 y}{\partial t^2}.$$

Here v is the wave velocity. The string is set in vibration by a sharp blow
at x = a. Hence we have

$$y(x, 0) = 0,$$

$$\frac{\partial y}{\partial t} = Lv_0\,\delta(x - a)\qquad\text{at } t = 0.$$

The constant L is included to compensate for the dimensions (inverse
length) of $\delta(x - a)$. With $\delta(x - a)$ given by Exercise 14.4.6(a), solve the
wave equation subject to these initial conditions.

ANS.
$$y(x, t) = \frac{2v_0 L}{\pi v}\sum_{n=1}^{\infty}\frac{1}{n}\sin\frac{n\pi a}{L}\,\sin\frac{n\pi x}{L}\,\sin\frac{n\pi vt}{L}.$$

(b) Show that the transverse velocity of the string $\partial y(x, t)/\partial t$ is given by

$$\frac{\partial y(x, t)}{\partial t} = 2v_0\sum_{n=1}^{\infty}\sin\frac{n\pi a}{L}\,\sin\frac{n\pi x}{L}\,\cos\frac{n\pi vt}{L}.$$

14.4.9 A string, clamped at x = 0 and at x = l, is vibrating freely. Its motion is described
by the wave equation

$$\frac{\partial^2 u(x, t)}{\partial t^2} = v^2\,\frac{\partial^2 u(x, t)}{\partial x^2}.$$

Assume a Fourier expansion of the form

$$u(x, t) = \sum_{n=1}^{\infty}b_n(t)\sin\frac{n\pi x}{l}$$

and determine the coefficients $b_n(t)$. The initial conditions are

$$u(x, 0) = f(x)\qquad\text{and}\qquad\frac{\partial}{\partial t}u(x, 0) = g(x).$$

Note. This is only half the conventional Fourier orthogonality integral interval.
However, as long as only the sines are included here, the Sturm-Liouville
boundary conditions are still satisfied and the functions are orthogonal.

ANS.
$$b_n(t) = A_n\cos\frac{n\pi vt}{l} + B_n\sin\frac{n\pi vt}{l},$$

$$A_n = \frac{2}{l}\int_0^l f(x)\sin\frac{n\pi x}{l}\,dx,\qquad B_n = \frac{2}{n\pi v}\int_0^l g(x)\sin\frac{n\pi x}{l}\,dx.$$

14.4.10 (a) Continuing the vibrating string problem, Exercise 14.4.9, the presence
of a resisting medium will damp the vibrations according to the equation

$$\frac{\partial^2 u(x, t)}{\partial t^2} = v^2\,\frac{\partial^2 u(x, t)}{\partial x^2} - k\,\frac{\partial u(x, t)}{\partial t}.$$

Assume a Fourier expansion

$$u(x, t) = \sum_{n=1}^{\infty}b_n(t)\sin\frac{n\pi x}{l}$$

and again determine the coefficients $b_n(t)$. Take the initial and boundary
conditions to be the same as in Exercise 14.4.9. Assume the damping to
be small.

(b) Repeat but assume the damping to be large.

ANS. (a)
$$b_n(t) = e^{-kt/2}\{A_n\cos\omega_n t + B_n\sin\omega_n t\},$$

$$A_n = \frac{2}{l}\int_0^l f(x)\sin\frac{n\pi x}{l}\,dx,$$

$$B_n = \frac{2}{\omega_n l}\int_0^l g(x)\sin\frac{n\pi x}{l}\,dx + \frac{k}{2\omega_n}A_n,\qquad \omega_n^2 = \left(\frac{n\pi v}{l}\right)^2 - \left(\frac{k}{2}\right)^2 > 0.$$

(b)
$$b_n(t) = e^{-kt/2}\{A_n\cosh\sigma_n t + B_n\sinh\sigma_n t\},$$

$$A_n = \frac{2}{l}\int_0^l f(x)\sin\frac{n\pi x}{l}\,dx,$$

$$B_n = \frac{2}{\sigma_n l}\int_0^l g(x)\sin\frac{n\pi x}{l}\,dx + \frac{k}{2\sigma_n}A_n,\qquad \sigma_n^2 = \left(\frac{k}{2}\right)^2 - \left(\frac{n\pi v}{l}\right)^2 > 0.$$

14.4.11 Find the charge distribution over the interior surfaces of the semicircles of
Exercise 14.3.6.

Note. You obtain a divergent series and this Fourier approach fails. Using
conformal mapping techniques, we may show the charge density to be
proportional to $\csc\theta$. Does $\csc\theta$ have a Fourier expansion?


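14.4.12 Given

$$\varphi_1(x) = \sum_{n=1}^{\infty}\frac{\sin nx}{n} = \begin{cases} -\tfrac{1}{2}(\pi + x), & -\pi \le x < 0\\ \phantom{-}\tfrac{1}{2}(\pi - x), & 0 < x \le \pi,\end{cases}$$

show by integrating that

$$\varphi_2(x) = \sum_{n=1}^{\infty}\frac{\cos nx}{n^2} = \begin{cases}\dfrac{(\pi + x)^2}{4} - \dfrac{\pi^2}{12}, & -\pi \le x \le 0\\[6pt] \dfrac{(\pi - x)^2}{4} - \dfrac{\pi^2}{12}, & 0 \le x \le \pi.\end{cases}$$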

14.4.13 Given

$$\psi_{2s}(x) = \sum_{n=1}^{\infty}\frac{\sin nx}{n^{2s}},\qquad \psi_{2s+1}(x) = \sum_{n=1}^{\infty}\frac{\cos nx}{n^{2s+1}}.$$

Develop the following recurrence relations:

(a) $\psi_{2s}(x) = \int_0^x \psi_{2s-1}(x)\,dx$

(b) $\psi_{2s+1}(x) = \zeta(2s + 1) - \int_0^x \psi_{2s}(x)\,dx$.

Note. These functions $\psi_s(x)$ and the $\varphi_s(x)$ of the preceding exercise are known
as Clausen functions. In theory they may be used to improve the rate of
convergence of a Fourier series. As with the series of Chapter 5, there is always the
question of how much analytical work we do and how much arithmetic work
we demand that the computing machine do. As machines become steadily
more powerful, the balance progressively shifts so that we are doing less and
demanding that the machines do more.


14.4.14 Show that

$$f(x) = \sum_{n=1}^{\infty}\frac{\cos nx}{n + 1}$$

may be written as

$$f(x) = \psi_1(x) - \varphi_2(x) + \sum_{n=1}^{\infty}\frac{\cos nx}{n^2(n + 1)}.$$

Note. $\psi_1(x)$ and $\varphi_2(x)$ are defined in the preceding exercises.


14.5 GIBBS PHENOMENON 

The Gibbs phenomenon is an overshoot, a peculiarity of the Fourier series 
and other eigenfunction series at a simple discontinuity. An example is seen in 
Fig. 14.1. 





Summation of Series 

In Section 14.1 the sum of the first several terms of the Fourier series for a 
sawtooth wave was plotted (Fig. 14.1). Now we develop an analytic method of 
summing the first r terms of our Fourier series. 

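From Eq. 14.13

$$a_n\cos nx + b_n\sin nx = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos n(t - x)\,dt. \tag{14.60}$$

Then the rth partial sum becomes¹

$$s_r(x) = \sum_{n=1}^{r}(a_n\cos nx + b_n\sin nx) = \Re\left\{\frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sum_{n=1}^{r}e^{i(t-x)n}\,dt\right\}. \tag{14.61}$$

Summing the finite series of exponentials (geometric progression)² and including
the constant term $\tfrac{1}{2}a_0 = (1/2\pi)\int_{-\pi}^{\pi} f(t)\,dt$, we obtain

$$\tfrac{1}{2}a_0 + s_r(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,\frac{\sin[(r + \frac{1}{2})(t - x)]}{\sin\frac{1}{2}(t - x)}\,dt. \tag{14.62}$$

This is convergent at all points, including t = x. The factor

$$\frac{1}{2\pi}\,\frac{\sin[(r + \frac{1}{2})(t - x)]}{\sin\frac{1}{2}(t - x)}$$

is the Dirichlet kernel mentioned in Section 8.7 as a Dirac delta distribution.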


¹ It is of some interest to note that this series also occurs in the analysis of the
diffraction grating (r slits).

² Compare Exercise 6.1.7 with initial value n = 1.


Square Wave

For convenience of numerical calculation we consider the behavior of the
Fourier series that represents the periodic square wave

$$f(x) = \begin{cases}\dfrac{h}{2}, & 0 < x < \pi,\\[6pt] -\dfrac{h}{2}, & -\pi < x < 0.\end{cases} \tag{14.63}$$

This is essentially the square wave used in Section 14.3, and we see immediately
that the solution is

$$f(x) = \frac{2h}{\pi}\left(\frac{\sin x}{1} + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots\right). \tag{14.64}$$

Applying Eq. 14.62 to our square wave (Eq. 14.63), we have the sum of the
first r terms (plus $\tfrac{1}{2}a_0$, which is zero here):

$$s_r(x) = \frac{h}{4\pi}\int_0^{\pi}\frac{\sin(r + \frac{1}{2})(t - x)}{\sin\frac{1}{2}(t - x)}\,dt - \frac{h}{4\pi}\int_{-\pi}^{0}\frac{\sin(r + \frac{1}{2})(t - x)}{\sin\frac{1}{2}(t - x)}\,dt$$

$$\phantom{s_r(x)} = \frac{h}{4\pi}\int_0^{\pi}\frac{\sin(r + \frac{1}{2})(t - x)}{\sin\frac{1}{2}(t - x)}\,dt - \frac{h}{4\pi}\int_0^{\pi}\frac{\sin(r + \frac{1}{2})(t + x)}{\sin\frac{1}{2}(t + x)}\,dt. \tag{14.65}$$

This last result follows from the transformation $t \to -t$ in the second integral.

Replacing t − x in the first term with s and t + x in the second term with s,
we obtain

$$s_r(x) = \frac{h}{4\pi}\int_{-x}^{\pi - x}\frac{\sin(r + \frac{1}{2})s}{\sin\frac{1}{2}s}\,ds - \frac{h}{4\pi}\int_{x}^{\pi + x}\frac{\sin(r + \frac{1}{2})s}{\sin\frac{1}{2}s}\,ds. \tag{14.66}$$

FIG. 14.10 Intervals of integration, Eq. 14.66

The intervals of integration are shown in Fig. 14.10 (top). Because the integrands
have the same mathematical form, the integrals from x to π − x cancel,
leaving the integral ranges shown in the bottom portion of Fig. 14.10:

$$s_r(x) = \frac{h}{4\pi}\int_{-x}^{x}\frac{\sin(r + \frac{1}{2})s}{\sin\frac{1}{2}s}\,ds - \frac{h}{4\pi}\int_{\pi - x}^{\pi + x}\frac{\sin(r + \frac{1}{2})s}{\sin\frac{1}{2}s}\,ds. \tag{14.67}$$

Consider the partial sum in the vicinity of the discontinuity at x = 0. As
$x \to 0$, the second integral becomes negligible, and we associate the first integral
with the discontinuity at x = 0. Using $(r + \tfrac{1}{2}) = p$ and $ps = \xi$, we obtain

$$s_r(x) = \frac{h}{2\pi}\int_0^{px}\frac{\sin\xi}{\sin(\xi/2p)}\,\frac{d\xi}{p}. \tag{14.68}$$

Calculation of Overshoot 

Our partial sum, $s_r(x)$, starts at zero when x = 0 (in agreement with Eq. 14.16)
and increases until $\xi = ps = \pi$, at which point the numerator, $\sin\xi$, goes negative.
For large r, and therefore for large p, our denominator remains positive. We get
the maximum value of the partial sum by taking the upper limit $px = \pi$. Right
here we see that x, the location of the overshoot maximum, is inversely
proportional to the number of terms taken:

$$x = \frac{\pi}{p} \approx \frac{\pi}{r}.$$

The maximum value of the partial sum is then

$$s_r(x)_{\max} = \frac{h}{2\pi}\int_0^{\pi}\frac{\sin\xi\,d\xi}{\sin(\xi/2p)\,p} \approx \frac{h}{2}\cdot\frac{2}{\pi}\int_0^{\pi}\frac{\sin\xi}{\xi}\,d\xi. \tag{14.69}$$

In terms of the sine integral, si(x) of Section 10.5,

$$\int_0^{\pi}\frac{\sin\xi}{\xi}\,d\xi = \frac{\pi}{2} + \mathrm{si}(\pi). \tag{14.70}$$

The integral is clearly greater than $\pi/2$, since it can be written as

$$\left(\int_0^{\infty} - \int_{\pi}^{3\pi} - \int_{3\pi}^{5\pi} - \cdots\right)\frac{\sin\xi}{\xi}\,d\xi = \int_0^{\pi}\frac{\sin\xi}{\xi}\,d\xi. \tag{14.71}$$

We saw in Section 7.2 that the integral from 0 to ∞ is $\pi/2$. From this integral
we are subtracting a series of negative terms. A Gaussian quadrature (Appendix
2) or a power-series expansion and term-by-term integration yields

$$\frac{2}{\pi}\int_0^{\pi}\frac{\sin\xi}{\xi}\,d\xi = 1.1789797\ldots, \tag{14.72}$$

which means that the Fourier series tends to overshoot the positive corner by
some 18 percent and to undershoot the negative corner by the same amount,
as suggested in Fig. 14.11. The inclusion of more terms (increasing r) does
nothing to remove this overshoot but merely moves it closer to the point of
discontinuity. The overshoot is the Gibbs phenomenon, and because of it the
Fourier series representation may be highly unreliable for precise numerical
work, especially in the vicinity of a discontinuity.

The Gibbs phenomenon is not limited to the Fourier series. It occurs with 
other eigenfunction expansions. Exercise 12.3.27 is an example of the Gibbs 
phenomenon for a Legendre series. 
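The overshoot factor of Eq. 14.72 and its persistence as more terms are added can both be checked with a few lines of code. The sketch below is illustrative only. It evaluates $(2/\pi)\int_0^\pi (\sin\xi/\xi)\,d\xi$ from the power series of the sine integral, and compares it with the square-wave series of Eq. 14.64 for h = 2, summed over the first m odd harmonics and evaluated at its first maximum, which for that truncation lies at $x = \pi/2m$. The function names and the choice m = 500 are assumptions of the example.

```python
import math

def gibbs_integral(terms=40):
    """(2/pi) * integral_0^pi sin(t)/t dt, from the series
    sum_k (-1)^k pi^(2k+1) / ((2k+1) * (2k+1)!)."""
    total = 0.0
    for k in range(terms):
        total += (-1) ** k * math.pi ** (2 * k + 1) / ((2 * k + 1) * math.factorial(2 * k + 1))
    return 2.0 / math.pi * total

def square_wave_partial(x, m, h=2.0):
    """Eq. 14.64 truncated after the first m odd harmonics (n = 1, 3, ..., 2m - 1)."""
    return (2.0 * h / math.pi) * sum(math.sin((2 * j - 1) * x) / (2 * j - 1)
                                     for j in range(1, m + 1))

m = 500
peak = square_wave_partial(math.pi / (2 * m), m)   # value at the first maximum
print(gibbs_integral(), peak)
```

For h = 2 the square wave jumps from −1 to +1, so a peak near 1.179 corresponds to the overshoot of about 18 percent quoted above. Increasing m moves the peak toward x = 0 without lowering it.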


EXERCISES 

14.5.1 With the partial sum summation techniques of this section, show that at a
discontinuity in $f(x)$ the Fourier series for $f(x)$ takes on the arithmetic mean of
the right- and left-hand limits:

$$f(x_0) = \tfrac{1}{2}[f(x_0+) + f(x_0-)].$$

In evaluating $\lim_{r\to\infty} s_r(x_0)$ you may find it convenient to identify part of the integrand
as a Dirac delta function.

14.5.2 Determine the partial sum, $s_n$, of the series in Eq. 14.64 by using

(a) $\dfrac{\sin mx}{m} = \displaystyle\int_0^x \cos my\,dy$

and

(b) $\displaystyle\sum_{p=1}^{n}\cos(2p - 1)y = \frac{\sin 2ny}{2\sin y}$.

Do you agree with the result given in Eq. 14.68?

14.5.3 Evaluate the finite step function series, Eq. 14.64, h = 2, using 100, 200, 300,
400, and 500 terms for x = 0.0000(0.0005)0.0200. Sketch your results (five curves)
or if a plotting routine is available, plot your results.

14.5.4 (a) Calculate the value of the Gibbs phenomenon integral

$$I = \frac{2}{\pi}\int_0^{\pi}\frac{\sin t}{t}\,dt$$

by numerical quadrature accurate to 12 significant figures.

(b) Check your result by (1) expanding the integrand as a series, (2) integrating
term by term, and (3) evaluating the integrated series. This calls for double
precision calculation.

ANS. I = 1.1789 7974 4472.


14.6 DISCRETE ORTHOGONALITY— DISCRETE 
FOURIER TRANSFORM 

For many physicists the Fourier transform is automatically the continuous
Fourier transform of Chapter 15. The use of the electronic digital computer,
however, necessarily replaces a continuum of values by a discrete set; an
integration is replaced by a summation. The continuous Fourier transform becomes
the discrete Fourier transform and an appropriate topic for this chapter.


Orthogonality Over Discrete Points 

The orthogonality of the trigonometric functions and the imaginary expo- 
nentials is expressed in Eqs. 14.7 to 14.10. This is the usual orthogonality for 
functions: integration of a product of functions over the orthogonality interval. 
The sines, cosines, and imaginary exponentials have the remarkable property 
that they are also orthogonal over a series of discrete, equally spaced points 
over the period (the orthogonality interval). 

Consider a set of 2N time values

$$t_k = 0,\ \frac{T}{2N},\ \frac{2T}{2N},\ \ldots,\ \frac{(2N - 1)T}{2N} \tag{14.73}$$

for the time interval (0, T). Then

$$t_k = \frac{kT}{2N},\qquad k = 0, 1, 2, \ldots, 2N - 1. \tag{14.74}$$

We shall prove that the exponential functions $\exp(2\pi i p t_k/T)$ and $\exp(2\pi i q t_k/T)$
satisfy an orthogonality relation over the discrete points $t_k$:

$$\sum_{k=0}^{2N-1}[\exp(2\pi i p t_k/T)]^*\exp(2\pi i q t_k/T) = 2N\,\delta_{p,\,q\pm 2nN}. \tag{14.75}$$

Here n, p, and q are all integers.

Replacing q − p by s, we find that the left-hand side of Eq. 14.75 becomes

$$\sum_{k=0}^{2N-1}\exp(2\pi i s t_k/T) = \sum_{k=0}^{2N-1}\exp(2\pi i s k/2N).$$

This right-hand side is obtained by using Eq. 14.74 to replace $t_k$. This is a finite
geometric series with an initial term 1 and a ratio

$$r = \exp(\pi i s/N).$$

From Eq. 5.7

$$\sum_{k=0}^{2N-1}\exp(2\pi i s t_k/T) = \begin{cases}\dfrac{1 - r^{2N}}{1 - r} = 0, & r \neq 1\\[6pt] 2N, & r = 1,\end{cases} \tag{14.76}$$

establishing Eq. 14.75, our basic orthogonality relation. The upper value, zero,
is a consequence of

$$r^{2N} = \exp(2\pi i s) = 1$$

for s an integer. The lower value, 2N, for r = 1 corresponds to p = q (or, more
generally, to q − p an integral multiple of 2N).

The orthogonality of the corresponding trigonometric functions is left as
Exercise 14.6.1.
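Equation 14.75 is easy to confirm numerically for small N. The following is a minimal sketch under the definitions of Eqs. 14.74 and 14.75; the function name `discrete_inner_product` and the default T = 1 are assumptions made for illustration.

```python
import cmath

def discrete_inner_product(p, q, N, T=1.0):
    """Left-hand side of Eq. 14.75, summed over the 2N points t_k = k T / (2N)."""
    total = 0j
    for k in range(2 * N):
        t_k = k * T / (2 * N)
        bra = cmath.exp(2j * cmath.pi * p * t_k / T).conjugate()
        ket = cmath.exp(2j * cmath.pi * q * t_k / T)
        total += bra * ket
    return total

N = 4
print(abs(discrete_inner_product(3, 3, N)))   # p = q: expect 2N
print(abs(discrete_inner_product(1, 3, N)))   # p != q: expect 0 up to rounding
print(abs(discrete_inner_product(1, 9, N)))   # q = p + 2N: expect 2N again
```

The third case shows why the Kronecker delta in Eq. 14.75 carries the $\pm 2nN$: on these sample points, indices that differ by a multiple of 2N give identical vectors.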
Discrete Fourier Transform

To simplify the notation slightly and to make more direct contact with
physics, we introduce the (reciprocal) ω-space, angular frequency, with

$$\omega_p = 2\pi p/T,\qquad p = 0, 1, 2, \ldots, 2N - 1. \tag{14.77}$$

We make p range over the same integers as k. The exponential $\exp(\pm 2\pi i p t_k/T)$
of Eq. 14.75 becomes $\exp(\pm i\omega_p t_k)$. The choice of whether to use the + or the −
sign is a matter of convenience or convention. In quantum mechanics the
negative sign is selected when expressing the time dependence.

Consider a function of time defined (measured) at the discrete time values $t_k$.
We may construct

$$F(\omega_p) = \frac{1}{2N}\sum_{k=0}^{2N-1} f(t_k)\,e^{i\omega_p t_k}. \tag{14.78}$$

Employing the orthogonality relation, we obtain

$$\frac{1}{2N}\sum_{p=0}^{2N-1}(e^{i\omega_p t_m})^*\,e^{i\omega_p t_k} = \delta_{mk}, \tag{14.78a}$$

and then replacing subscript m by k, we find that the amplitudes, $f(t_k)$, become

$$f(t_k) = \sum_{p=0}^{2N-1}F(\omega_p)\,e^{-i\omega_p t_k}. \tag{14.79}$$

The time function $f(t_k)$, k = 0, 1, 2, ..., 2N − 1, and the frequency function
$F(\omega_p)$, p = 0, 1, 2, ..., 2N − 1, are discrete Fourier transforms of each other.¹
Compare Eqs. 14.78 and 14.79 with the corresponding continuous Fourier
transforms Eqs. 15.22 and 15.23 of Chapter 15.

Limitations 

Taken as a pair of mathematical relations, the discrete Fourier transforms
are exact. We can say that the 2N 2N-component vectors $\exp(-i\omega_p t_k)$, k = 0,
1, 2, ..., 2N − 1, form a complete set² spanning the $t_k$-space. Then $f(t_k)$ in
Eq. 14.79 is simply a particular linear combination of these vectors.
Alternatively, we may take the 2N measured components $f(t_k)$ as defining a
2N-component vector in $t_k$-space. Then, Eq. 14.78 yields the 2N-component vector
$F(\omega_p)$ in the reciprocal $\omega_p$-space. Equations 14.78 and 14.79 become matrix
equations with $\exp(i\omega_p t_k)/(2N)^{1/2}$ the elements of a unitary matrix.

The limitations of the discrete Fourier transform arise when we apply
Eqs. 14.78 and 14.79 to physical systems and attempt physical interpretation
and the generalization $F(\omega_p) \to F(\omega)$. Example 14.6.1 illustrates the problem that
can occur. The most important precaution to be taken to avoid trouble is to
take N sufficiently large so that there is no angular frequency component of a
higher angular frequency than $\omega_N = 2\pi N/T$. For details on errors and
limitations in the use of the discrete Fourier transform the reader is referred to
Bergland and Hamming.

¹ The two transform equations may be symmetrized with a resulting $(2N)^{-1/2}$
in each equation if desired.

² By Eq. 14.76 these vectors are orthogonal and are therefore linearly
independent.

EXAMPLE 14.6.1 Discrete Fourier Transform — Aliasing 
Consider the relatively simple case of $T = 2\pi$, N = 2, and $f(t_k) = \cos t_k$.
From

$$t_k = kT/4 = k\pi/2,\qquad k = 0, 1, 2, 3, \tag{14.80}$$

$f(t_k) = \cos(t_k)$ is represented by the four-component vector

$$f(t_k) = (1, 0, -1, 0). \tag{14.81}$$

The frequencies, $\omega_p$, are given by Eq. 14.77:

$$\omega_p = 2\pi p/T = p. \tag{14.82}$$

Clearly, $\cos t_k$ implies a p = 1 component and no other frequency components.
The transformation matrix

$$(2N)^{-1}\exp(i\omega_p t_k) = (2N)^{-1}\exp(ipk\pi/2)$$

becomes

$$\frac{1}{4}\begin{pmatrix} 1 & 1 & 1 & 1\\ 1 & i & -1 & -i\\ 1 & -1 & 1 & -1\\ 1 & -i & -1 & i\end{pmatrix}. \tag{14.83}$$

Note that the 2N × 2N matrix has only 2N independent components. It is the
repetition of values that makes the fast Fourier transform technique possible.

Operating on column vector $f(t_k)$, we find that this matrix yields a column
vector

$$F(\omega_p) = (0, \tfrac{1}{2}, 0, \tfrac{1}{2}). \tag{14.84}$$

Apparently, there is a p = 3 frequency component present. We reconstruct $f(t_k)$
by Eq. 14.79, obtaining

$$f(t_k) = \tfrac{1}{2}e^{-it_k} + \tfrac{1}{2}e^{-3it_k}. \tag{14.85}$$

Taking real parts, we can rewrite the equation as

$$f(t_k) = \tfrac{1}{2}\cos t_k + \tfrac{1}{2}\cos 3t_k. \tag{14.86}$$

Obviously, this result, Eq. 14.86, is not identical with our original $f(t_k) = \cos t_k$.
But $\cos t_k = \tfrac{1}{2}\cos t_k + \tfrac{1}{2}\cos 3t_k$ at $t_k = 0$, $\pi/2$, $\pi$, and $3\pi/2$. The $\cos t_k$ and $\cos 3t_k$
mimic each other because of the limited number of data points (and the
particular choice of data points). This error of one frequency mimicking another is
known as aliasing. The problem can be minimized by taking more data points.
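The numbers in this example can be reproduced directly from Eqs. 14.78 and 14.79. The sketch below is an illustration, with function names chosen here rather than taken from any library; it uses the sign and normalization conventions of those two equations, writing $\omega_p t_k = 2\pi pk/2N$.

```python
import cmath
import math

def dft(samples):
    """Eq. 14.78: F(w_p) = (1/2N) sum_k f(t_k) exp(+i w_p t_k), with 2N = len(samples)."""
    M = len(samples)
    return [sum(f_k * cmath.exp(2j * math.pi * p * k / M) for k, f_k in enumerate(samples)) / M
            for p in range(M)]

def inverse_dft(coeffs):
    """Eq. 14.79: f(t_k) = sum_p F(w_p) exp(-i w_p t_k)."""
    M = len(coeffs)
    return [sum(F_p * cmath.exp(-2j * math.pi * p * k / M) for p, F_p in enumerate(coeffs))
            for k in range(M)]

t = [k * math.pi / 2 for k in range(4)]           # Eq. 14.80 with T = 2*pi, N = 2
f = [math.cos(t_k) for t_k in t]                  # Eq. 14.81
F = dft(f)                                        # compare with Eq. 14.84
reconstructed = inverse_dft(F)
alias = [0.5 * math.cos(t_k) + 0.5 * math.cos(3 * t_k) for t_k in t]   # Eq. 14.86

print([round(abs(F_p), 12) for F_p in F])
print([round(v.real, 12) for v in reconstructed], [round(a, 12) for a in alias])
```

The transform shows equal weight at p = 1 and p = 3, and on the four sample points the aliased combination of Eq. 14.86 is indistinguishable from $\cos t_k$.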
Fast Fourier Transform

The fast Fourier transform is a particular way of factoring and rearranging
the terms in the sums of the discrete Fourier transform. Brought to the attention
of the scientific community by Cooley and Tukey,³ its importance lies in the
drastic reduction in the number of numerical operations required. Because of
the tremendous increase in speed achieved (and reduction in cost), the fast
Fourier transform has been hailed as one of the few really significant advances
in numerical analysis in the past few decades.

For N time values (measurements) a direct calculation of a discrete Fourier
transform would mean about $N^2$ multiplications. For N a power of 2 the fast
Fourier transform technique of Cooley and Tukey cuts the number of
multiplications required to $(N/2)\log_2 N$. If N = 1024 (= $2^{10}$), the fast Fourier
transform achieves a computational reduction by a factor of over 200. This is why
the fast Fourier transform is called fast and why it has literally revolutionized
the digital processing of waveforms.
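The factoring idea can be sketched in a few lines. What follows is a simple recursive radix-2 decomposition written for illustration. It is not the routine of Cooley and Tukey's paper or of any particular library, and all names in it are hypothetical. It splits a length-M transform (M a power of 2) into two length-M/2 transforms over the even- and odd-numbered samples, uses the positive exponent of Eq. 14.78, and omits the 1/2N normalization.

```python
import cmath
import math

def direct_dft(samples):
    """Direct sum, about M^2 multiplications: F_p = sum_k f_k exp(+2 pi i p k / M)."""
    M = len(samples)
    return [sum(f_k * cmath.exp(2j * math.pi * p * k / M) for k, f_k in enumerate(samples))
            for p in range(M)]

def fft_radix2(samples):
    """Same unnormalized transform by recursive even/odd splitting (M a power of 2)."""
    M = len(samples)
    if M == 0 or M & (M - 1):
        raise ValueError("length must be a power of 2")
    if M == 1:
        return [complex(samples[0])]
    even = fft_radix2(samples[0::2])
    odd = fft_radix2(samples[1::2])
    half = M // 2
    out = [0j] * M
    for p in range(half):
        twiddle = cmath.exp(2j * math.pi * p / M) * odd[p]   # one multiplication per pair
        out[p] = even[p] + twiddle
        out[p + half] = even[p] - twiddle
    return out

def multiplication_counts(M):
    """Return (M^2, (M/2) log2 M) for M a power of 2."""
    return M * M, (M // 2) * (M.bit_length() - 1)

data = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 3.0, -2.0]
print(max(abs(a - b) for a, b in zip(fft_radix2(data), direct_dft(data))))
print(multiplication_counts(1024))
```

For M = 1024 the two counts are 1 048 576 and 5120, a ratio of about 205, in line with the factor of over 200 quoted above.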

The fast Fourier transform should be available at every computation center.
It is included in the SSP. Details on the internal operation will be found in the
paper by Cooley and Tukey and in the paper by Bergland.⁴


EXERCISES 

14.6.1 Derive the trigonometric forms of discrete orthogonality corresponding to Eq.
14.75:

$$\sum_{k=0}^{2N-1}\cos(2\pi p t_k/T)\sin(2\pi q t_k/T) = 0,$$

$$\sum_{k=0}^{2N-1}\cos(2\pi p t_k/T)\cos(2\pi q t_k/T) = \begin{cases} 0, & p \neq q\\ N, & p = q \neq 0, N\\ 2N, & p = q = 0, N,\end{cases}$$

$$\sum_{k=0}^{2N-1}\sin(2\pi p t_k/T)\sin(2\pi q t_k/T) = \begin{cases} 0, & p \neq q\\ N, & p = q \neq 0, N\\ 0, & p = q = 0, N.\end{cases}$$

Hint. Trigonometric identities such as

$$\sin A\cos B = \tfrac{1}{2}[\sin(A + B) + \sin(A - B)]$$

are useful.

14.6.2 Equation 14.75 exhibits orthogonality summing over time points. Show that we
have the same orthogonality summing over frequency points.

$$\frac{1}{2N}\sum_{p=0}^{2N-1}(e^{i\omega_p t_m})^*\,e^{i\omega_p t_k} = \delta_{mk}.$$

³ J. W. Cooley and J. W. Tukey, Math. Computation 19, 297 (1965).

⁴ G. D. Bergland, A Guided Tour of the Fast Fourier Transform, IEEE Spectrum,
pp. 41-52 (July 1969).

14.6.3 Show, in detail, how to go from

$$F(\omega_p) = \frac{1}{2N}\sum_{k=0}^{2N-1} f(t_k)\,e^{i\omega_p t_k}$$

to

$$f(t_k) = \sum_{p=0}^{2N-1}F(\omega_p)\,e^{-i\omega_p t_k}.$$

14.6.4 The functions $f(t_k)$ and $F(\omega_p)$ are discrete Fourier transforms of each other.
Derive the following symmetry relations:

(a) If $f(t_k)$ is real, $F(\omega_p)$ is Hermitian symmetric; that is,

$$F(\omega_p) = F^*\left(\frac{4\pi N}{T} - \omega_p\right).$$

(b) If $f(t_k)$ is pure imaginary,

$$F(\omega_p) = -F^*\left(\frac{4\pi N}{T} - \omega_p\right).$$

Note. The symmetry of part (a) is an illustration of aliasing. The frequency
$4\pi N/T - \omega_p$ masquerades as the frequency $\omega_p$.

14.6.5 Given N = 2, $T = 2\pi$, and $f(t_k) = \sin t_k$.

(a) Find $F(\omega_p)$, p = 0, 1, 2, 3.

(b) Reconstruct $f(t_k)$ from $F(\omega_p)$ and exhibit the aliasing of $\omega_1 = 1$ and $\omega_3 = 3$.

ANS. (a) $F(\omega_p) = (0, i/2, 0, -i/2)$
(b) $f(t_k) = \tfrac{1}{2}\sin t_k - \tfrac{1}{2}\sin 3t_k$.

14.6.6 Show that the Chebyshev polynomials satisfy a discrete orthogonality
relation

$$\tfrac{1}{2}T_m(-1)T_n(-1) + \sum_{s=1}^{N-1}T_m(x_s)T_n(x_s) + \tfrac{1}{2}T_m(1)T_n(1) = \begin{cases} 0, & m \neq n\\ N/2, & m = n \neq 0\\ N, & m = n = 0.\end{cases}$$

Here $x_s = \cos\theta_s$, where the (N + 1) $\theta_s$'s are equally spaced along the θ axis:

$$\theta_s = \frac{s\pi}{N},\qquad s = 0, 1, 2, \ldots, N.$$

15 INTEGRAL TRANSFORMS


15.1 INTEGRAL TRANSFORMS 

Frequently in mathematical physics we encounter pairs of functions related
by an expression of the following form:

$$g(\alpha) = \int_a^b f(t)\,K(\alpha, t)\,dt. \tag{15.1}$$

The function $g(\alpha)$ is called the (integral) transform of $f(t)$ by the kernel $K(\alpha, t)$.
The operation may also be described as mapping a function $f(t)$ in t-space into
another function $g(\alpha)$ in α-space. This interpretation takes on physical
significance in the time-frequency relation of Example 15.3.1 and in the real space-
momentum space relations of Section 15.6.


Fourier Transform 

One of the most useful of the infinite number of possible transforms is the
Fourier transform given by

$$g(\alpha) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} f(t)\,e^{i\alpha t}\,dt. \tag{15.2}$$

Two modifications of this form, developed in Section 15.3, are the Fourier
cosine and Fourier sine transforms:

$$g_c(\alpha) = \sqrt{\frac{2}{\pi}}\int_0^{\infty} f(t)\cos\alpha t\,dt, \tag{15.3}$$

$$g_s(\alpha) = \sqrt{\frac{2}{\pi}}\int_0^{\infty} f(t)\sin\alpha t\,dt. \tag{15.4}$$

The Fourier transform is based on the kernel $e^{i\alpha t}$ and its real and imaginary parts
taken separately, $\cos\alpha t$ and $\sin\alpha t$. Because these kernels are the functions used
to describe waves, Fourier transforms appear frequently in studies of waves and
the extraction of information from waves, particularly when phase information
is involved. The output of a stellar interferometer, for instance, involves a Fourier
transform of the brightness across a stellar disk. The electron distribution in an
atom may be obtained from a Fourier transform of the amplitude of scattered
X-rays. In quantum mechanics the physical origin of the Fourier relations of
Section 15.6 is the wave nature of matter and our description of matter in terms
of waves.


Laplace, Mellin, and Hankel Transforms 

Three other useful kernels are

$$e^{-\alpha t},\qquad tJ_n(\alpha t),\qquad t^{\alpha - 1}.$$

These give rise to the following transforms:

$$g(\alpha) = \int_0^{\infty} f(t)\,e^{-\alpha t}\,dt,\qquad\text{Laplace transform} \tag{15.5}$$

$$g(\alpha) = \int_0^{\infty} f(t)\,tJ_n(\alpha t)\,dt,\qquad\text{Hankel transform (Fourier-Bessel)} \tag{15.6}$$

$$g(\alpha) = \int_0^{\infty} f(t)\,t^{\alpha - 1}\,dt,\qquad\text{Mellin transform.} \tag{15.7}$$

Clearly, the possible types are unlimited. These transforms have been useful in
mathematical analysis and in physical applications. We have actually used the
Mellin transform without calling it by name; that is, $g(\alpha) = (\alpha - 1)!$ is the Mellin
transform of $f(t) = e^{-t}$. Of course, we could just as well say $g(\alpha) = n!/\alpha^{n+1}$ is the
Laplace transform of $f(t) = t^n$. Of the three, the Laplace transform is by far the
most used. It is discussed at length in Sections 15.8 to 15.12. The Hankel
transform, a Fourier transform for a Bessel function expansion, represents a limiting
case of a Fourier-Bessel series. It occurs in potential problems in cylindrical
coordinates and has been applied extensively in acoustics.
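The two closed forms just quoted can be spot-checked by direct quadrature. The sketch below is a rough illustration, not a recommended numerical method: it truncates the infinite range at a finite upper limit and applies composite Simpson's rule. The helper names, the cutoff of 60, and the number of panels are all assumptions, and the Mellin helper as written requires $\alpha \ge 1$ so that the integrand is finite at t = 0.

```python
import math

def simpson(func, a, b, n=20000):
    """Composite Simpson's rule on [a, b] with n (even) panels."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def laplace_transform(f, alpha, upper=60.0):
    """Eq. 15.5 with the upper limit truncated at `upper`."""
    return simpson(lambda t: f(t) * math.exp(-alpha * t), 0.0, upper)

def mellin_transform(f, alpha, upper=60.0):
    """Eq. 15.7 with the upper limit truncated at `upper` (alpha >= 1 assumed)."""
    return simpson(lambda t: f(t) * t ** (alpha - 1.0), 0.0, upper)

# Laplace transform of t^3 at alpha = 2 against n!/alpha^(n+1) = 6/16.
print(laplace_transform(lambda t: t ** 3, 2.0), math.factorial(3) / 2.0 ** 4)

# Mellin transform of exp(-t) at alpha = 3 against (alpha - 1)! = Gamma(3) = 2.
print(mellin_transform(lambda t: math.exp(-t), 3.0), math.gamma(3.0))
```

Both pairs should agree to well within the quadrature error, since the integrands have decayed to a negligible size long before the cutoff.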


Linearity 

All these integral transforms are linear; that is, 
[ci/i(£) + c 2 f 2 {t)]K{a.,t)dt 


= cJ l {t)K(a,t)dt+ c 2 f 2 (t)K(a,t)dt , 


f‘ 


cf(t)K(oc,t)dt = c f(t)K(oc,t)dt, 


(15.8) 

(15.9) 


where c x and c 2 are constants and f x (t) and f 2 (t) are functions for which the 
transform operation is defined. 

Representing our linear integral transform by the operator if, we obtain 

g(a) = &M (15.10) 

We expect an inverse operator if -1 exists such that 1 


1 Expectation is not proof, and here proof of existence is complicated because 
we are actually in an w/m/te-dimensional Hilbert space. We shall prove 
existence in the special cases of interest by actual construction. 
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FIG. 15.1 


f(t) = J?- 1 g(a). (15.11) 

For our three Fourier transforms F£~ x is given in Section 15.3. In general, the 
determination of the inverse transform is the main problem in using integral 
transforms. The inverse Laplace transform is discussed in Section 15.12. For 
details of the inverse Hankel and inverse Mellin transforms the reader is referred 
to the references at the end of the chapter. 

Integral transforms have many special physical applications and interpretations
that are noted in the remainder of this chapter. The most common application
is outlined in Fig. 15.1. Perhaps an original problem can be solved only
with difficulty, if at all, in the original coordinates (space). It often happens that
the transform of the problem can be solved relatively easily. Then, the inverse
transform returns the solution from the transform coordinates to the original
system. Example 15.4.1 and Exercise 15.4.1 illustrate this technique.


EXERCISES 

15.1.1 The Fourier transforms for a function of two variables are

$$ F(u, v) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, e^{i(ux + vy)}\, dx\, dy, $$

$$ f(x, y) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} F(u, v)\, e^{-i(ux + vy)}\, du\, dv. $$

Using $f(x, y) = f([x^2 + y^2]^{1/2})$, show that the zero-order Hankel transforms

$$ F(\rho) = \int_0^\infty r f(r) J_0(\rho r)\, dr, $$

$$ f(r) = \int_0^\infty \rho F(\rho) J_0(\rho r)\, d\rho, $$

are a special case of the Fourier transforms.

This technique may be generalized to derive the Hankel transforms of order
$\nu$, $\nu = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots$ (compare Sneddon, Fourier Transforms). A more general
approach, valid for $\nu > -\tfrac{1}{2}$, is presented in Sneddon's The Use of Integral
Transforms. It might also be noted that the Hankel transforms of nonintegral
order $\nu = \pm\tfrac{1}{2}$ reduce to Fourier sine and cosine transforms.


15.1.2 Assuming the validity of the Hankel transform-inverse transform pair of
equations

$$ g(\alpha) = \int_0^\infty f(t) J_n(\alpha t)\, t\, dt, $$

$$ f(t) = \int_0^\infty g(\alpha) J_n(\alpha t)\, \alpha\, d\alpha, $$

show that the Dirac delta function has a Bessel integral representation

$$ \delta(t - t') = t \int_0^\infty J_n(\alpha t) J_n(\alpha t')\, \alpha\, d\alpha. $$

This expression is useful in developing Green's functions in cylindrical
coordinates, where the eigenfunctions are Bessel functions.

15.1.3 From the Fourier transforms, Eqs. 15.22 and 15.23, show that the transformation

$$ t \to \ln x, \qquad i\omega \to \alpha - \gamma $$

leads to

$$ G(\alpha) = \int_0^\infty F(x)\, x^{\alpha - 1}\, dx $$

and

$$ F(x) = \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} G(\alpha)\, x^{-\alpha}\, d\alpha. $$

These are the Mellin transforms. A similar change of variables is employed in
Section 15.12 to derive the inverse Laplace transform.


15.1.4 Verify the following Mellin transforms:

(a) $$ \int_0^\infty x^{\alpha - 1} \sin(kx)\, dx = k^{-\alpha} (\alpha - 1)!\, \sin\frac{\pi\alpha}{2}, \qquad -1 < \alpha < 1. $$

(b) $$ \int_0^\infty x^{\alpha - 1} \cos(kx)\, dx = k^{-\alpha} (\alpha - 1)!\, \cos\frac{\pi\alpha}{2}, \qquad 0 < \alpha < 1. $$

Hint. You can force the integrals into a tractable form by inserting a convergence
factor $e^{-bx}$ and (after integrating) letting $b \to 0$. Also, $\cos kx + i \sin kx = \exp ikx$.


15.2 DEVELOPMENT OF THE FOURIER INTEGRAL

In Chapter 14 it was shown that Fourier series are useful in representing
certain functions (1) over a limited range $[0, 2\pi]$, $[-L, L]$, and so on, or (2) for
the infinite interval $(-\infty, \infty)$, if the function is periodic. We now turn our
attention to the problem of representing a nonperiodic function over the infinite
range. Physically this means resolving a single pulse or wave packet into
sinusoidal waves.
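We have seen (Section 14.2) that for the interval $[-L, L]$ the coefficients $a_n$
and $b_n$ could be written as

$$ a_n = \frac{1}{L} \int_{-L}^{L} f(t) \cos\frac{n\pi t}{L}\, dt, \tag{15.12} $$

$$ b_n = \frac{1}{L} \int_{-L}^{L} f(t) \sin\frac{n\pi t}{L}\, dt. \tag{15.13} $$

The resulting Fourier series is

$$ f(x) = \frac{1}{2L} \int_{-L}^{L} f(t)\, dt + \frac{1}{L} \sum_{n=1}^{\infty} \cos\frac{n\pi x}{L} \int_{-L}^{L} f(t) \cos\frac{n\pi t}{L}\, dt + \frac{1}{L} \sum_{n=1}^{\infty} \sin\frac{n\pi x}{L} \int_{-L}^{L} f(t) \sin\frac{n\pi t}{L}\, dt \tag{15.14} $$

or

$$ f(x) = \frac{1}{2L} \int_{-L}^{L} f(t)\, dt + \frac{1}{L} \sum_{n=1}^{\infty} \int_{-L}^{L} f(t) \cos\frac{n\pi}{L}(t - x)\, dt. \tag{15.15} $$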


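We now let the parameter $L$ approach infinity, transforming the finite interval
$[-L, L]$ into the infinite interval $(-\infty, \infty)$. We set

$$ \frac{n\pi}{L} = \omega, \qquad \frac{\pi}{L} = \Delta\omega, \qquad \text{with } L \to \infty. $$

Then we have

$$ f(x) \to \frac{1}{\pi} \sum_{n=1}^{\infty} \Delta\omega \int_{-\infty}^{\infty} f(t) \cos\omega(t - x)\, dt \tag{15.16} $$

or

$$ f(x) = \frac{1}{\pi} \int_0^\infty d\omega \int_{-\infty}^{\infty} f(t) \cos\omega(t - x)\, dt, \tag{15.17} $$

replacing the infinite sum by the integral over $\omega$. The first term (corresponding to
$a_0$) has vanished, assuming that $\int_{-\infty}^{\infty} f(t)\, dt$ exists.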

It must be emphasized that this result (Eq. 15.17) is purely formal. It is not
intended as a rigorous derivation, but it can be made rigorous (compare I. N.
Sneddon, Fourier Transforms, Section 3.2). We take Eq. 15.17 as the Fourier
integral. It is subject to the conditions that $f(x)$ is (1) piecewise continuous, (2)
differentiable, and (3) absolutely integrable; that is, $\int_{-\infty}^{\infty} |f(x)|\, dx$ is finite.
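
As an illustration of Eq. 15.17, the double integral can be evaluated numerically for a function that meets the three conditions. This is a sketch under stated assumptions, not a derivation: the Gaussian test function, the finite cutoffs `t_max` and `w_max`, and the helper names are all choices made here for illustration.

```python
import math

def trapezoid(func, a, b, n):
    """Composite trapezoid rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

def f(t):
    """Test function (an assumption of this sketch): a Gaussian pulse."""
    return math.exp(-0.5 * t * t)

def fourier_integral(x, t_max=10.0, w_max=10.0, n_t=400, n_w=200):
    """Right-hand side of Eq. 15.17 with both infinite limits truncated."""
    def inner(w):
        # Inner integral over t for a fixed angular frequency w.
        return trapezoid(lambda t: f(t) * math.cos(w * (t - x)), -t_max, t_max, n_t)
    return trapezoid(inner, 0.0, w_max, n_w) / math.pi

x = 0.7
print(fourier_integral(x), f(x))   # the two values should agree closely
```

The truncation is harmless here only because both the Gaussian and its frequency content fall off very rapidly; a slowly decaying $f(t)$ would need larger cutoffs.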


Fourier Integral — Exponential Form 

Our Fourier integral (Eq. 15.17) may be put into exponential form by noting
that

$$ f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} d\omega \int_{-\infty}^{\infty} f(t) \cos\omega(t - x)\, dt, \tag{15.18} $$

whereas

$$ \frac{1}{2\pi} \int_{-\infty}^{\infty} d\omega \int_{-\infty}^{\infty} f(t) \sin\omega(t - x)\, dt = 0; \tag{15.19} $$

$\cos\omega(t - x)$ is an even function of $\omega$ and $\sin\omega(t - x)$ is an odd function of $\omega$.
Adding Eqs. 15.18 and 15.19 (with a factor $i$), we obtain

$$ f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-i\omega x}\, d\omega \int_{-\infty}^{\infty} f(t)\, e^{i\omega t}\, dt. \tag{15.20} $$

The variable $\omega$ introduced here is an arbitrary mathematical variable. In many
physical problems, however, it corresponds to the angular frequency $\omega$. We may
then interpret Eq. 15.18 or 15.20 as a representation of $f(x)$ in terms of a
distribution of infinitely long sinusoidal wave trains of angular frequency $\omega$, in
which this frequency is a continuous variable.



Dirac Delta Function Derivation 

If the order of integration of Eq. 15.20 is reversed, we may rewrite it as

$$ f(x) = \int_{-\infty}^{\infty} f(t) \left\{ \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i\omega(t - x)}\, d\omega \right\} dt. \tag{15.20a} $$

Apparently the quantity in curly brackets behaves as a delta function, $\delta(t - x)$.
We might take Eq. 15.20a as presenting us with a representation of the Dirac
delta function. Alternatively, we take it as a clue to a new derivation of the
Fourier integral theorem.

From Eq. 8.114 (shifting the singularity from $t = 0$ to $t = x$)

$$ f(x) = \lim_{n \to \infty} \int_{-\infty}^{\infty} f(t)\, \delta_n(t - x)\, dt, \tag{15.21a} $$

where $\delta_n(t - x)$ is a sequence defining the distribution $\delta(t - x)$. Note that Eq.
15.21a assumes that $f(t)$ is continuous at $t = x$.

We take $\delta_n(t - x)$ to be

$$ \delta_n(t - x) = \frac{\sin n(t - x)}{\pi(t - x)} = \frac{1}{2\pi} \int_{-n}^{n} e^{i\omega(t - x)}\, d\omega, \tag{15.21b} $$

using Eq. 8.111. Substituting into Eq. 15.21a, we have

$$ f(x) = \lim_{n \to \infty} \frac{1}{2\pi} \int_{-\infty}^{\infty} f(t) \int_{-n}^{n} e^{i\omega(t - x)}\, d\omega\, dt. \tag{15.21c} $$

Interchanging the order of integration and then taking the limit as $n \to \infty$, we
have Eq. 15.20, the Fourier integral theorem.

With the understanding that it belongs under an integral sign as in Eq. 15.21a,
the identification

$$ \delta(t - x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{i\omega(t - x)}\, d\omega \tag{15.21d} $$

provides a very useful representation of the delta function. It is used to great
advantage in Sections 15.5 and 15.6.
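
The limiting process of Eqs. 15.21a and 15.21b can be watched numerically. The sketch below is illustrative only: the Gaussian test function, the integration window `t_max`, and the names `delta_n` and `sift` are assumptions introduced here, not part of the text.

```python
import math

def delta_n(u, n):
    """Eq. 15.21b: sin(n u) / (pi u), with the limiting value n/pi at u = 0."""
    if abs(u) < 1e-12:
        return n / math.pi
    return math.sin(n * u) / (math.pi * u)

def sift(func, x, n, t_max=12.0, steps=24000):
    """Trapezoid estimate of the integral of func(t) * delta_n(t - x) over t."""
    h = 2.0 * t_max / steps
    total = 0.0
    for i in range(steps + 1):
        t = -t_max + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * func(t) * delta_n(t - x, n)
    return total * h

def gauss(t):
    """Smooth test function, continuous at every t."""
    return math.exp(-0.5 * t * t)

for n in (1, 2, 4, 8):
    # As n grows, the integral approaches gauss(0.5).
    print(n, sift(gauss, 0.5, n), gauss(0.5))
```

For small n the sequence member is a broad kernel and the integral differs noticeably from $f(x)$; the approach to $f(x)$ as n increases is the content of Eq. 15.21a.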


15.3 FOURIER TRANSFORMS— INVERSION 
THEOREM 


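Let us define $g(\omega)$, the Fourier transform of the function $f(t)$, by

$$ g(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(t)\, e^{i\omega t}\, dt. \tag{15.22} $$

Exponential Transform

Then from Eq. 15.20 we have the inverse relation

$$ f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(\omega)\, e^{-i\omega x}\, d\omega. \tag{15.23} $$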


It will be noted that Eqs. 15.22 and 15.23 are almost but not quite symmetrical,
differing in the sign of $i$.

Here two points deserve comment. First, the $1/\sqrt{2\pi}$ symmetry is a matter of
choice, not of necessity. Many authors will attach the entire $1/2\pi$ factor of Eq.
15.20 to one of the two equations: to Eq. 15.22 or Eq. 15.23. Second, although
the Fourier integral Eq. 15.20 has received much attention in the mathematics
literature, we shall be primarily interested in the Fourier transform and its
inverse. They are the equations with physical significance.
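
A short numerical round trip may help fix the sign and normalization conventions of Eqs. 15.22 and 15.23. This is a sketch, assuming a shifted Gaussian as the test function and finite integration limits; the function names are hypothetical.

```python
import cmath
import math

def transform(func, w, sign, lim=10.0, steps=400):
    """(1/sqrt(2 pi)) * integral of func(t) * exp(sign * i * w * t) dt (trapezoid rule)."""
    h = 2.0 * lim / steps
    total = 0j
    for i in range(steps + 1):
        t = -lim + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * func(t) * cmath.exp(sign * 1j * w * t)
    return total * h / math.sqrt(2.0 * math.pi)

def f(t):
    """Test function (an assumption of this sketch): Gaussian centered at t = 1."""
    return math.exp(-0.5 * (t - 1.0) ** 2)

def g(w):
    """Eq. 15.22: transform with exp(+i w t)."""
    return transform(f, w, +1)

def f_recovered(x):
    """Eq. 15.23: inverse transform with exp(-i w x)."""
    return transform(g, x, -1)

w = 0.8
print(g(w), cmath.exp(1j * w) * math.exp(-0.5 * w * w))  # closed form for this f
print(f_recovered(0.3), f(0.3))                          # round trip
```

With the symmetric $1/\sqrt{2\pi}$ convention the same routine serves for both directions; only the sign in the exponent changes.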

When we move the Fourier transform pair to three-dimensional space,
it becomes

$$ g(\mathbf{k}) = \frac{1}{(2\pi)^{3/2}} \int f(\mathbf{r})\, e^{i\mathbf{k}\cdot\mathbf{r}}\, d^3x, \tag{15.23a} $$

$$ f(\mathbf{r}) = \frac{1}{(2\pi)^{3/2}} \int g(\mathbf{k})\, e^{-i\mathbf{k}\cdot\mathbf{r}}\, d^3k. \tag{15.23b} $$

The integrals are over all space. Verification, if desired, follows immediately by
substituting the left-hand side of one equation into the integrand of the other
equation and using the three-dimensional delta function.$^1$ Equation 15.23b may
be interpreted as an expansion of a function $f(\mathbf{r})$ in a continuum of plane wave
eigenfunctions. $g(\mathbf{k})$ then becomes the amplitude of the wave $\exp(-i\mathbf{k}\cdot\mathbf{r})$.


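1 $$ \delta(\mathbf{r}_1 - \mathbf{r}_2) = \delta(x_1 - x_2)\,\delta(y_1 - y_2)\,\delta(z_1 - z_2) $$
$$ = \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp[ik_1(x_1 - x_2)]\, dk_1 \cdot \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp[ik_2(y_1 - y_2)]\, dk_2 \cdot \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp[ik_3(z_1 - z_2)]\, dk_3 $$
$$ = \frac{1}{(2\pi)^3} \int \exp[i\mathbf{k}\cdot(\mathbf{r}_1 - \mathbf{r}_2)]\, d^3k. $$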
Cosine Transform

If $f(x)$ is odd or even, these transforms may be expressed in a somewhat
different form. Consider, first, $f_c(x) = f_c(-x)$, even. Writing the exponential of
Eq. 15.22 in trigonometric form, we have

$$ g_c(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f_c(t) (\cos\omega t + i \sin\omega t)\, dt = \sqrt{\frac{2}{\pi}} \int_0^\infty f_c(t) \cos\omega t\, dt, \tag{15.24} $$

the $\sin\omega t$ dependence vanishing on integration over the symmetric interval
$(-\infty, \infty)$. Similarly, since $\cos\omega t$ is even, Eq. 15.23 transforms to

$$ f_c(x) = \sqrt{\frac{2}{\pi}} \int_0^\infty g_c(\omega) \cos\omega x\, d\omega. \tag{15.25} $$

Equations 15.24 and 15.25 are known as Fourier cosine transforms.


Sine Transform 

The corresponding pair of Fourier sine transforms is obtained by assuming
that $f_s(x) = -f_s(-x)$, odd, and applying the same symmetry arguments. The
equations are

$$ g_s(\omega) = \sqrt{\frac{2}{\pi}} \int_0^\infty f_s(t) \sin\omega t\, dt,\,^2 \tag{15.26} $$

$$ f_s(x) = \sqrt{\frac{2}{\pi}} \int_0^\infty g_s(\omega) \sin\omega x\, d\omega. \tag{15.27} $$

From the last equation we may develop the physical interpretation that $f(x)$ is
being described by a continuum of sine waves. The amplitude of $\sin\omega x$ is given
by $\sqrt{2/\pi}\, g_s(\omega)$, in which $g_s(\omega)$ is the Fourier sine transform of $f_s(x)$. It will be seen
that Eq. 15.27 is the integral analog of the summation (Eq. 14.18). Similar
interpretations hold for the cosine and exponential cases.

If we take Eqs. 15.22, 15.24, and 15.26 as the direct integral transforms,
described by $\mathcal{L}$ in Eq. 15.10 (Section 15.1), the corresponding inverse transforms,
$\mathcal{L}^{-1}$ of Eq. 15.11, are given by Eqs. 15.23, 15.25, and 15.27.

The reader will note that the Fourier cosine transforms and the Fourier sine
transforms each involve only positive values (and zero) of the arguments. We
use the parity of $f(x)$ to establish the transforms, but once the transforms are
established, the behavior of the functions $f$ and $g$ for negative argument is
irrelevant. In effect, the transform equations themselves impose a definite
parity: even for the Fourier cosine transform and odd for the Fourier sine
transform.


EXAMPLE 15.3.1 Finite Wave Train 


2 Note that a factor $-i$ has been absorbed into this $g(\omega)$.

An important application of the Fourier transform is the resolution of a
finite pulse into sinusoidal waves. Imagine that an infinite wave train $\sin\omega_0 t$ is
clipped by Kerr cell or saturable dye cell shutters so that we have

$$ f(t) = \begin{cases} \sin\omega_0 t, & |t| < \dfrac{N\pi}{\omega_0}, \\[4pt] 0, & |t| > \dfrac{N\pi}{\omega_0}. \end{cases} \tag{15.28} $$

This corresponds to $N$ cycles of our original wave train (Fig. 15.2). Since $f(t)$ is
odd, we may use the Fourier sine transform (Eq. 15.26) to obtain

$$ g_s(\omega) = \sqrt{\frac{2}{\pi}} \int_0^{N\pi/\omega_0} \sin\omega_0 t\, \sin\omega t\, dt. \tag{15.29} $$

Integrating, we find our amplitude function

$$ g_s(\omega) = \sqrt{\frac{2}{\pi}} \left[ \frac{\sin[(\omega_0 - \omega)(N\pi/\omega_0)]}{2(\omega_0 - \omega)} - \frac{\sin[(\omega_0 + \omega)(N\pi/\omega_0)]}{2(\omega_0 + \omega)} \right]. \tag{15.30} $$

It is of some considerable interest to see how $g_s(\omega)$ depends on frequency. For
large $\omega_0$ and $\omega \approx \omega_0$ only the first term will be of any importance. It is plotted in
Fig. 15.3. This is the amplitude curve for the single slit diffraction pattern. There
are zeroes at



FIG. 15.3 Fourier transform of finite wave train
$$ \frac{\omega_0 - \omega}{\omega_0} = \frac{\Delta\omega}{\omega_0} = \pm\frac{1}{N},\ \pm\frac{2}{N},\ \text{and so on.} \tag{15.31} $$

$g_s(\omega)$ may also be interpreted as a Dirac delta distribution as in Section 8.7.
Since the contributions outside the central maximum are small, we may take

$$ \Delta\omega = \frac{\omega_0}{N} \tag{15.32} $$

as a good measure of the spread in frequency of our wave pulse. Clearly, if $N$ is
large (a long pulse), the frequency spread will be small. On the other hand, if our
pulse is clipped short, $N$ small, the frequency distribution will be wider.
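
The amplitude function and its first zero can be checked with a few lines of code. The sketch below compares the closed form of Eq. 15.30 with a direct numerical evaluation of Eq. 15.29; the parameter values ($\omega_0 = 10$, $N = 5$) and the function names are assumptions chosen for illustration.

```python
import math

def g_s_formula(w, w0, n_cycles):
    """Amplitude function of Eq. 15.30 (not to be evaluated exactly at w = w0)."""
    half_width = n_cycles * math.pi / w0
    first = math.sin((w0 - w) * half_width) / (2.0 * (w0 - w))
    second = math.sin((w0 + w) * half_width) / (2.0 * (w0 + w))
    return math.sqrt(2.0 / math.pi) * (first - second)

def g_s_numeric(w, w0, n_cycles, steps=20000):
    """Trapezoid estimate of the sine transform integral, Eq. 15.29."""
    half_width = n_cycles * math.pi / w0
    h = half_width / steps
    total = 0.0
    for i in range(steps + 1):
        t = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * math.sin(w0 * t) * math.sin(w * t)
    return math.sqrt(2.0 / math.pi) * total * h

w0, n_cycles = 10.0, 5
print(g_s_formula(9.3, w0, n_cycles), g_s_numeric(9.3, w0, n_cycles))
# First zero on the low-frequency side, Eq. 15.31: w = w0 * (1 - 1/N).
print(g_s_formula(w0 * (1.0 - 1.0 / n_cycles), w0, n_cycles))
```

Doubling `n_cycles` moves the first zero twice as close to $\omega_0$, which is the narrowing of the frequency spread described by Eq. 15.32.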


Uncertainty Principle 

Here is a classical analog of the famous uncertainty principle of quantum
mechanics. If we are dealing with electromagnetic waves,

$$ \frac{h\omega}{2\pi} = E, \quad \text{energy (of our wave pulse or photon)}, \qquad \frac{h\,\Delta\omega}{2\pi} = \Delta E, \tag{15.33} $$

$h$ being Planck's constant. Here $\Delta E$ represents an uncertainty in the energy of our
pulse. There is also an uncertainty in the time, for our wave of $N$ cycles requires
$2N\pi/\omega_0$ seconds to pass. Taking

$$ \Delta t = \frac{2N\pi}{\omega_0}, \tag{15.34} $$

we have the product of these two uncertainties:

$$ \Delta E \cdot \Delta t = \frac{h\,\Delta\omega}{2\pi} \cdot \frac{2\pi N}{\omega_0} = h\, \frac{\omega_0}{2\pi N} \cdot \frac{2\pi N}{\omega_0} = h. \tag{15.35} $$

The Heisenberg uncertainty principle actually states

$$ \Delta E \cdot \Delta t \geq \frac{h}{4\pi}, \tag{15.36} $$

and this is clearly satisfied in our example.


EXERCISES 

15.3.1 (a) Show that $g(-\omega) = g^*(\omega)$ is a necessary and sufficient condition for
$f(x)$ to be real.

(b) Show that $g(-\omega) = -g^*(\omega)$ is a necessary and sufficient condition for
$f(x)$ to be pure imaginary.

Note. The condition of part (a) is used in the development of the dispersion
relations of Section 7.3.
15.3.2 Let $F(\omega)$ be the Fourier (exponential) transform of $f(x)$ and $G(\omega)$ the Fourier
transform of $g(x) = f(x + a)$. Show that

$$ G(\omega) = e^{-ia\omega} F(\omega). $$

15.3.3 The function

$$ f(x) = \begin{cases} 1, & |x| < 1, \\ 0, & |x| > 1, \end{cases} $$

is a symmetrical finite step function.

(a) Find $g_c(\omega)$, the Fourier cosine transform of $f(x)$.

(b) Taking the inverse cosine transform, show that

$$ f(x) = \frac{2}{\pi} \int_0^\infty \frac{\sin\omega \cos\omega x}{\omega}\, d\omega. $$

(c) From part (b) show that

$$ \int_0^\infty \frac{\sin\omega \cos\omega x}{\omega}\, d\omega = \begin{cases} 0, & |x| > 1, \\[2pt] \dfrac{\pi}{4}, & |x| = 1, \\[4pt] \dfrac{\pi}{2}, & |x| < 1. \end{cases} $$

15.3.4 (a) Show that the Fourier sine and cosine transforms of $e^{-at}$ are

$$ g_s(\omega) = \sqrt{\frac{2}{\pi}}\, \frac{\omega}{\omega^2 + a^2}, \qquad g_c(\omega) = \sqrt{\frac{2}{\pi}}\, \frac{a}{\omega^2 + a^2}. $$

Hint. Each of the transforms can be related to the other by integration by
parts.

(b) Show that

$$ \int_0^\infty \frac{\omega \sin\omega x}{\omega^2 + a^2}\, d\omega = \frac{\pi}{2}\, e^{-ax}, \qquad x > 0, $$

$$ \int_0^\infty \frac{\cos\omega x}{\omega^2 + a^2}\, d\omega = \frac{\pi}{2a}\, e^{-ax}, \qquad x > 0. $$

These results may also be obtained by contour integration (Exercise 7.2.14).

15.3.5 Find the Fourier transform of the triangular pulse

$$ f(x) = \begin{cases} h(1 - a|x|), & |x| < 1/a, \\ 0, & |x| > 1/a. \end{cases} $$

Note. This function provides another delta sequence with $h = a$ and $a \to \infty$.
15.3.6 We may define a sequence

$$ \delta_n(x) = \begin{cases} n, & |x| < 1/2n, \\ 0, & |x| > 1/2n. \end{cases} $$

(This is Eq. 8.108.) Express $\delta_n(x)$ as a Fourier integral (via the Fourier integral
theorem, inverse transform, etc.). Finally, show that we may write

$$ \delta(x) = \lim_{n \to \infty} \delta_n(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-ikx}\, dk. $$

15.3.7 Using the sequence

$$ \delta_n(x) = \frac{n}{\sqrt{\pi}} \exp(-n^2 x^2), $$

show that

$$ \delta(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-ikx}\, dk. $$

Note. Remember that $\delta(x)$ is defined in terms of its behavior as part of an
integrand (Section 8.7, especially Eqs. 8.114 and 8.115).

15.3.8 Derive sine and cosine representations of $\delta(t - x)$ that are comparable to the
exponential representation, Eq. 15.21d.

ANS. $$ \frac{2}{\pi} \int_0^\infty \sin\omega t \sin\omega x\, d\omega, \qquad \frac{2}{\pi} \int_0^\infty \cos\omega t \cos\omega x\, d\omega. $$

15.3.9 In a resonant cavity an electromagnetic oscillation of frequency $\omega_0$ dies out as

$$ A(t) = A_0\, e^{-\omega_0 t/2Q}\, e^{-i\omega_0 t}, \qquad t > 0. $$

(Take $A(t) = 0$ for $t < 0$.)

The parameter $Q$ is a measure of the ratio of stored energy to energy loss per
cycle. Calculate the frequency distribution of the oscillation, $a^*(\omega)a(\omega)$, where
$a(\omega)$ is the Fourier transform of $A(t)$.

Note. The larger $Q$ is, the sharper your resonance line will be.

ANS. $$ a^*(\omega)a(\omega) = \frac{A_0^2}{2\pi}\, \frac{1}{(\omega - \omega_0)^2 + (\omega_0/2Q)^2}. $$

15.3.10 Prove that

$$ \frac{\hbar}{2\pi i} \int_{-\infty}^{\infty} \frac{e^{-i\omega t}\, d\omega}{E_0 - i\Gamma/2 - \hbar\omega} = \begin{cases} \exp(-\Gamma t/2\hbar) \exp(-iE_0 t/\hbar), & t > 0, \\ 0, & t < 0. \end{cases} $$

This Fourier integral appears in a variety of problems in quantum mechanics:
WKB barrier penetration, scattering, time-dependent perturbation theory, and
so on.

Hint. Try contour integration.

15.3.11 Verify that the following are Fourier integral transforms of one another:

(a) $$ \begin{cases} \sqrt{\dfrac{2}{\pi}}\, \dfrac{1}{\sqrt{a^2 - x^2}}, & |x| < a, \\[6pt] 0, & |x| > a, \end{cases} \qquad \text{and } J_0(ay), $$

(b) $$ \begin{cases} 0, & |x| < a, \\[2pt] -\sqrt{\dfrac{2}{\pi}}\, \dfrac{1}{\sqrt{x^2 - a^2}}, & |x| > a, \end{cases} \qquad \text{and } N_0(a|y|), $$

(c) $$ \sqrt{\frac{\pi}{2}}\, \frac{1}{\sqrt{x^2 + a^2}} \qquad \text{and } K_0(a|y|). $$

(d) Can you suggest why $I_0(ay)$ is not included in this list?

Hint. $J_0$, $N_0$, and $K_0$ may be transformed most easily by using an exponential
representation, reversing the order of integration, and employing the Dirac
delta function exponential representation (Section 15.2). These cases can be
treated equally well as Fourier cosine transforms.

Note. The $K_0$ relation appears as a consequence of a Green's function equation
in Exercise 16.6.14.


15.3.12 A calculation of the magnetic field of a circular current loop in circular
cylindrical coordinates leads to the integral

$$ \int_0^\infty \cos kz\; k\, K_1(ka)\, dk. $$

Show that this integral is equal to

$$ \frac{\pi a}{2 (z^2 + a^2)^{3/2}}. $$

Hint. Try differentiating Exercise 15.3.11(c).


15.3.13 As an extension of Exercise 15.3.11, show that

(a) $$ \int_0^\infty J_0(y)\, dy = 1, $$

(b) $$ \int_0^\infty N_0(y)\, dy = 0, $$

(c) $$ \int_0^\infty K_0(y)\, dy = \frac{\pi}{2}. $$


15.3.14 The Fourier integral, Eq. 15.18, has been held meaningless for $f(t) = \cos\alpha t$.

Show that the Fourier integral can be extended to cover $f(t) = \cos\alpha t$ by use of
the Dirac delta function.


15.3.15 Show that

$$ \int_0^\infty \sin ka\; J_0(k\rho)\, dk = \begin{cases} (a^2 - \rho^2)^{-1/2}, & \rho < a, \\ 0, & \rho > a. \end{cases} $$

Here $a$ and $\rho$ are positive. The equation comes from the determination of the
distribution of charge on an isolated conducting disk, radius $a$.

Note that the function on the right has an infinite discontinuity at $\rho = a$.
Note. A Laplace transform approach appears in Exercise 15.10.8.


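15.3.16 The function $f(\mathbf{r})$ has a Fourier exponential transform

$$ g(\mathbf{k}) = \frac{1}{(2\pi)^{3/2}} \int f(\mathbf{r})\, e^{i\mathbf{k}\cdot\mathbf{r}}\, d^3x = \frac{1}{(2\pi)^{3/2} k^2}. $$

Determine $f(\mathbf{r})$.

Hint. Use spherical polar coordinates in $k$-space.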


1 
15.3.17 (a) Calculate the Fourier exponential transform of $f(x) = e^{-a|x|}$.

(b) Calculate the inverse transform by employing the calculus of residues
(Section 7.2).

15.3.18 Show that the following are Fourier transforms of each other:

$$ i^n J_n(t) \qquad \text{and} \qquad \begin{cases} \sqrt{\dfrac{2}{\pi}}\, T_n(x) (1 - x^2)^{-1/2}, & |x| < 1, \\[4pt] 0, & |x| > 1. \end{cases} $$

$T_n(x)$ is the $n$th-order Chebyshev polynomial.

Hint. With $T_n(\cos\theta) = \cos n\theta$, the transform of $T_n(x)(1 - x^2)^{-1/2}$ leads to an
integral representation of $J_n(t)$.

15.3.19 Show that the Fourier exponential transform of

$$ f(\mu) = \begin{cases} P_n(\mu), & |\mu| \leq 1, \\ 0, & |\mu| > 1 \end{cases} $$

is $(2 i^n/\sqrt{2\pi})\, j_n(kr)$. Here $P_n(\mu)$ is a Legendre polynomial and $j_n(kr)$ is a spherical
Bessel function.

15.3.20 Show that the three-dimensional Fourier exponential transform of a radially
symmetric function may be rewritten as a Fourier sine transform:

$$ \frac{1}{(2\pi)^{3/2}} \int f(r)\, e^{i\mathbf{k}\cdot\mathbf{r}}\, d^3x = \frac{1}{k} \sqrt{\frac{2}{\pi}} \int_0^\infty [r f(r)] \sin kr\, dr. $$

15.3.21 (a) Show that $f(x) = x^{-1/2}$ is self-reciprocal under both Fourier cosine and
sine transforms; that is,

$$ \sqrt{\frac{2}{\pi}} \int_0^\infty x^{-1/2} \cos xt\, dx = t^{-1/2}, $$

$$ \sqrt{\frac{2}{\pi}} \int_0^\infty x^{-1/2} \sin xt\, dx = t^{-1/2}. $$

(b) Use the preceding results to evaluate the Fresnel integrals $\int_0^\infty \cos(y^2)\, dy$
and $\int_0^\infty \sin(y^2)\, dy$.

15.4 FOURIER TRANSFORM OF DERIVATIVES 

In Section 15.1 Fig. 15.1 outlines the overall technique of using Fourier
transforms and inverse transforms to solve a problem. Here we take an initial
step in solving a differential equation: obtaining the Fourier transform of a
derivative.

Using the exponential form, we determine that the Fourier transform of
$f(x)$ is

$$ g(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\, e^{i\omega x}\, dx \tag{15.37} $$

and for $df(x)/dx$

$$ g_1(\omega) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \frac{df(x)}{dx}\, e^{i\omega x}\, dx. \tag{15.38} $$

Integrating Eq. 15.38 by parts, we obtain

$$ g_1(\omega) = \frac{e^{i\omega x}}{\sqrt{2\pi}} f(x) \bigg|_{-\infty}^{\infty} - \frac{i\omega}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\, e^{i\omega x}\, dx. \tag{15.39} $$

If $f(x)$ vanishes$^1$ as $x \to \pm\infty$, we have

$$ g_1(\omega) = -i\omega\, g(\omega); \tag{15.40} $$

that is, the transform of the derivative is $(-i\omega)$ times the transform of the original
function. This may readily be generalized to the $n$th derivative to yield

$$ g_n(\omega) = (-i\omega)^n g(\omega), \tag{15.41} $$

provided all the integrated parts vanish as $x \to \pm\infty$. This is the power of the
Fourier transform, the reason it is so useful in solving (partial) differential
equations. The operation of differentiation has been replaced by a multiplication.
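
Equation 15.40 is easy to test numerically for a function whose derivative is known in closed form. The sketch below is an illustration under assumptions: the Gaussian test function, the finite integration limits, and the helper name `fourier` are choices made here.

```python
import cmath
import math

def fourier(func, w, lim=10.0, steps=2000):
    """Eq. 15.37 by the trapezoid rule: (1/sqrt(2 pi)) * integral of func(x) exp(i w x) dx."""
    h = 2.0 * lim / steps
    total = 0j
    for i in range(steps + 1):
        x = -lim + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * func(x) * cmath.exp(1j * w * x)
    return total * h / math.sqrt(2.0 * math.pi)

def f(x):
    """Test function that vanishes as x -> +/- infinity."""
    return math.exp(-0.5 * x * x)

def df(x):
    """Its derivative, written out analytically."""
    return -x * math.exp(-0.5 * x * x)

w = 1.3
g = fourier(f, w)       # transform of f
g1 = fourier(df, w)     # transform of df/dx
print(g1, -1j * w * g)  # Eq. 15.40: the two should agree
```

The agreement relies on the integrated part of Eq. 15.39 vanishing, which holds for this test function.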


EXAMPLE 15.4.1 Wave Equation 


This technique may be used to advantage in handling partial differential
equations. To illustrate the technique let us derive a familiar expression of
elementary physics. An infinitely long string is vibrating freely. The amplitude
$y$ of the (small) vibrations satisfies the wave equation

$$ \frac{\partial^2 y}{\partial x^2} = \frac{1}{v^2} \frac{\partial^2 y}{\partial t^2}. \tag{15.42} $$

We shall assume an initial condition

$$ y(x, 0) = f(x). \tag{15.43} $$

Applying our Fourier transform, which means multiplying by $e^{i\alpha x}$ and
integrating over $x$, we obtain

$$ \int_{-\infty}^{\infty} \frac{\partial^2 y(x, t)}{\partial x^2}\, e^{i\alpha x}\, dx = \frac{1}{v^2} \int_{-\infty}^{\infty} \frac{\partial^2 y(x, t)}{\partial t^2}\, e^{i\alpha x}\, dx \tag{15.44} $$

or

$$ (-i\alpha)^2\, Y(\alpha, t) = \frac{1}{v^2} \frac{\partial^2 Y(\alpha, t)}{\partial t^2}. \tag{15.45} $$

Here we have used

$$ Y(\alpha, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} y(x, t)\, e^{i\alpha x}\, dx \tag{15.46} $$

and Eq. 15.41 for the second derivative. Note that the integrated part of Eq.
15.39 vanishes. The wave has not yet gone to $\pm\infty$. Since no derivatives with
respect to $\alpha$ appear, Eq. 15.45 is actually an ordinary differential equation, in
fact, the linear oscillator equation. This transformation, from a partial to an
ordinary differential equation, is a significant achievement. We solve Eq. 15.45
subject to the appropriate initial conditions. At $t = 0$, applying Eq. 15.43, Eq.
15.46 reduces to

$$ Y(\alpha, 0) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\, e^{i\alpha x}\, dx = F(\alpha). \tag{15.47} $$

1 Apart from cases such as Exercise 15.3.6, $f(x)$ must vanish as $x \to \pm\infty$
in order for the Fourier transform of $f(x)$ to exist.

The general solution of Eq. 15.45 in exponential form is

$$ Y(\alpha, t) = F(\alpha)\, e^{\pm i v \alpha t}. \tag{15.48} $$

Using the inversion formula (Eq. 15.23), we have

$$ y(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} Y(\alpha, t)\, e^{-i\alpha x}\, d\alpha \tag{15.49} $$

and, by Eq. 15.48,

$$ y(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(\alpha)\, e^{-i\alpha(x \mp vt)}\, d\alpha. \tag{15.50} $$

Since $f(x)$ is the Fourier inverse transform of $F(\alpha)$,

$$ y(x, t) = f(x \mp vt), \tag{15.51} $$

corresponding to waves advancing in the $+x$- and $-x$-directions, respectively.

The particular linear combination of waves is given by the boundary condition
of Eq. 15.43 and some other boundary condition such as a restriction on
$\partial y/\partial t$.

The accomplishment of the Fourier transform here deserves special emphasis. 
Our Fourier transform converted a partial differential equation into an ordinary 
differential equation, where the “degree of transcendence” of the problem was 
reduced. In Section 15.9 Laplace transforms are used to convert ordinary 
differential equations (with constant coefficients) into algebraic equations. 
Again, the degree of transcendence is reduced. The problem is simplified — as 
outlined in Fig. 15.1. 
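
The traveling-wave result of this example can be spot-checked by finite differences. The sketch below makes an assumption that is not in the example itself: it takes the additional condition $\partial y/\partial t = 0$ at $t = 0$, which selects the equal-weight combination $\tfrac{1}{2}[f(x - vt) + f(x + vt)]$. The initial shape, the wave speed `V`, and the step size `h` are likewise illustrative choices.

```python
import math

V = 2.0  # wave speed (illustrative value)

def f(x):
    """Assumed initial displacement y(x, 0): a Gaussian bump."""
    return math.exp(-x * x)

def y(x, t):
    """Equal mix of right- and left-moving waves (assumes dy/dt = 0 at t = 0)."""
    return 0.5 * (f(x - V * t) + f(x + V * t))

def residual(x, t, h=1e-3):
    """Central-difference estimate of y_xx - y_tt / V**2, which should be near zero."""
    y_xx = (y(x + h, t) - 2.0 * y(x, t) + y(x - h, t)) / h ** 2
    y_tt = (y(x, t + h) - 2.0 * y(x, t) + y(x, t - h)) / h ** 2
    return y_xx - y_tt / V ** 2

print(residual(0.4, 0.3))      # small, limited by the finite-difference error
print(y(0.4, 0.0), f(0.4))     # initial condition, Eq. 15.43
```

A different restriction on $\partial y/\partial t$ would change the weights of the two waves but not the fact that each satisfies the wave equation.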


EXERCISES 

15.4.1 The one-dimensional Fermi age equation for the diffusion of neutrons slowing
down in some medium (such as graphite) is

$$ \frac{\partial^2 q(x, \tau)}{\partial x^2} = \frac{\partial q(x, \tau)}{\partial \tau}. $$

Here $q$ is the number of neutrons that slow down, falling below some given
energy per second per unit volume. The Fermi age, $\tau$, is a measure of the energy
loss.
If $q(x, 0) = S\delta(x)$, corresponding to a plane source of neutrons at $x = 0$,
emitting $S$ neutrons per unit area per second, derive the solution

$$ q = S\, \frac{e^{-x^2/4\tau}}{\sqrt{4\pi\tau}}. $$

Hint. Replace $q(x, \tau)$ with

$$ p(k, \tau) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} q(x, \tau)\, e^{ikx}\, dx. $$

This is analogous to the diffusion of heat in an infinite medium.

15.4.2 Equation 15.41 yields

$$ g_2(\omega) = -\omega^2 g(\omega) $$

for the Fourier transform of the second derivative of $f(x)$. The condition
$f(x) \to 0$ for $x \to \pm\infty$ may be relaxed slightly. Find the least restrictive condition
for the preceding equation for $g_2(\omega)$ to hold.

ANS. $$ \left[ \frac{df(x)}{dx} - i\omega f(x) \right] e^{i\omega x} \bigg|_{-\infty}^{\infty} = 0. $$


15.4.3 The one-dimensional neutron diffusion equation with a (plane) source is

$$ -D \frac{d^2\varphi(x)}{dx^2} + K^2 D \varphi(x) = Q\delta(x), $$

where $\varphi(x)$ is the neutron flux, $Q\delta(x)$ is the (plane) source at $x = 0$, and $D$ and
$K^2$ are constants. Apply a Fourier transform. Solve the equation in transform
space. Transform your solution back into $x$-space.

ANS. $$ \varphi(x) = \frac{Q}{2KD}\, e^{-|Kx|}. $$

15.4.3A For a point source at the origin the three-dimensional neutron diffusion
equation becomes

$$ -D \nabla^2 \varphi(\mathbf{r}) + K^2 D \varphi(\mathbf{r}) = Q\delta(\mathbf{r}). $$

Apply a three-dimensional Fourier transform. Solve the transformed equation.
Transform the solution back into $\mathbf{r}$-space.

15.4.4 (a) Given that $F(\mathbf{k})$ is the three-dimensional Fourier transform of $f(\mathbf{r})$ and
$F_1(\mathbf{k})$ is the three-dimensional Fourier transform of $\nabla f(\mathbf{r})$, show that

$$ F_1(\mathbf{k}) = (-i\mathbf{k}) F(\mathbf{k}). $$

This is a three-dimensional generalization of Eq. 15.40.

(b) Show that the three-dimensional Fourier transform of $\nabla \cdot \nabla f(\mathbf{r})$ is

$$ F_2(\mathbf{k}) = (-i\mathbf{k})^2 F(\mathbf{k}). $$

Note. Vector $\mathbf{k}$ is not the unit vector along the $z$-axis. It is a vector in the
transform space. In Section 15.6 we shall have $\hbar\mathbf{k} = \mathbf{p}$, linear momentum.

15.5 CONVOLUTION THEOREM 

We shall employ convolutions to solve differential equations, to normalize 
momentum wave functions (Section 15.6), and to investigate transfer functions 
(Section 15.7). 
FIG. 15.4


Let us consider two functions $f(x)$ and $g(x)$ with Fourier transforms $F(t)$ and
$G(t)$, respectively. We define the operation

$$ f * g = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(y) f(x - y)\, dy \tag{15.52} $$

as the convolution of the two functions $f$ and $g$ over the interval $(-\infty, \infty)$. This
form of an integral appears in probability theory in the determination of the
probability density of two random, independent variables. Our solution of
Poisson's equation, Eq. 8.99, may be interpreted as a convolution of a charge
distribution, $\rho(\mathbf{r}_2)$, and a weighting function, $(4\pi\varepsilon_0 |\mathbf{r}_1 - \mathbf{r}_2|)^{-1}$. In other works
this is sometimes referred to as the Faltung, to use the German term for
"folding."$^1$ We now transform the integral in Eq. 15.52 by introducing the
Fourier transforms:



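$$ \int_{-\infty}^{\infty} g(y) f(x - y)\, dy = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(y) \int_{-\infty}^{\infty} F(t)\, e^{-it(x - y)}\, dt\, dy $$
$$ = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(t) \left[ \int_{-\infty}^{\infty} g(y)\, e^{ity}\, dy \right] e^{-itx}\, dt \tag{15.53} $$
$$ = \int_{-\infty}^{\infty} F(t) G(t)\, e^{-itx}\, dt, $$

interchanging the order of integration and transforming $g(y)$. This result may be
interpreted as follows: The Fourier inverse transform of a product of Fourier
transforms is the convolution of the original functions, $f * g$.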
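
Equation 15.53 can be illustrated with a pair of Gaussians. In the sketch below the two sides are computed independently and compared; the choice $f = g = e^{-x^2/2}$, the use of the known transform $F(t) = G(t) = e^{-t^2/2}$ for that choice, the finite integration limits, and the function names are all assumptions of the illustration.

```python
import cmath
import math

ROOT_2PI = math.sqrt(2.0 * math.pi)

def trapezoid(func, a, b, n):
    """Composite trapezoid rule; works for real or complex integrands."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

def f(x):
    return math.exp(-0.5 * x * x)

g = f  # same Gaussian for both functions (illustrative choice)

def F(t):
    """Fourier transform (Eq. 15.22 convention) of exp(-x**2 / 2): a known closed form."""
    return math.exp(-0.5 * t * t)

G = F

def convolution(x):
    """f * g as defined in Eq. 15.52, including the 1/sqrt(2 pi) factor."""
    return trapezoid(lambda y: g(y) * f(x - y), -12.0, 12.0, 480) / ROOT_2PI

def inverse_of_product(x):
    """Inverse transform (Eq. 15.23 convention) of the product F(t) G(t)."""
    return trapezoid(lambda t: F(t) * G(t) * cmath.exp(-1j * t * x),
                     -12.0, 12.0, 480) / ROOT_2PI

x = 0.9
print(convolution(x), inverse_of_product(x), math.exp(-0.25 * x * x) / math.sqrt(2.0))
```

For this pair both sides reduce to $e^{-x^2/4}/\sqrt{2}$, the third number printed.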

For the special case $x = 0$ we have

$$ \int_{-\infty}^{\infty} F(t) G(t)\, dt = \int_{-\infty}^{\infty} f(-y) g(y)\, dy. \tag{15.54} $$

1 For $f(y) = e^{-y}$, $f(y)$ and $f(x - y)$ are plotted in Fig. 15.4. Clearly, $f(y)$ and
$f(x - y)$ are mirror images of each other in relation to the vertical line $y = x/2$,
that is, we could generate $f(x - y)$ by folding over $f(y)$ on the line $y = x/2$.

The minus sign in $-y$ suggests that modifications be tried. We now do this with
$g^*$ instead of $g$ using a different technique.

Parseval's Relation

Results analogous to Eqs. 15.53 and 15.54 may be derived for the Fourier sine
and cosine transforms (Exercises 15.5.1 and 15.5.2). Equation 15.54 and the
corresponding sine and cosine convolutions are often labeled "Parseval's relations"
by analogy with Parseval's theorem for Fourier series (Chapter 14, Exercise
14.4.2).

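The Parseval relation$^{2,3}$

$$ \int_{-\infty}^{\infty} F(\omega) G^*(\omega)\, d\omega = \int_{-\infty}^{\infty} f(t) g^*(t)\, dt \tag{15.55} $$

may be derived very beautifully using the Dirac delta function representation,
Eq. 15.21d. We have

$$ \int_{-\infty}^{\infty} f(t) g^*(t)\, dt = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(\omega)\, e^{-i\omega t}\, d\omega \cdot \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} G^*(x)\, e^{ixt}\, dx\, dt, \tag{15.56} $$

with attention to the complex conjugation in the $G^*(x)$ to $g^*(t)$ transform.
Integrating over $t$ first, and using Eq. 15.21d, we obtain

$$ \int_{-\infty}^{\infty} f(t) g^*(t)\, dt = \int_{-\infty}^{\infty} F(\omega) \int_{-\infty}^{\infty} G^*(x)\, \delta(x - \omega)\, dx\, d\omega = \int_{-\infty}^{\infty} F(\omega) G^*(\omega)\, d\omega, \tag{15.57} $$

our desired Parseval relation. If $f(t) = g(t)$, then the integrals in the Parseval
relation are normalization integrals (Section 9.4). Equation 15.57 guarantees
that if a function $f(t)$ is normalized to unity, its transform $F(\omega)$ is likewise
normalized to unity. This is extremely important in quantum mechanics as
developed in the next section.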
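
The normalization statement can be checked for a specific function. The sketch below uses $f(t) = e^{-|t|}$, whose transform under the convention of Eq. 15.22 is $F(\omega) = \sqrt{2/\pi}/(1 + \omega^2)$ (a standard result, compare Exercise 15.3.17). The cutoffs, step counts, and names are assumptions of the sketch; both integrands are even, so only the positive half-axis is integrated and the result doubled.

```python
import math

def trapezoid(func, a, b, n):
    """Composite trapezoid rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (func(a) + func(b))
    for i in range(1, n):
        total += func(a + i * h)
    return total * h

def f(t):
    return math.exp(-abs(t))

def F(w):
    """Transform of exp(-|t|) with the symmetric 1/sqrt(2 pi) convention."""
    return math.sqrt(2.0 / math.pi) / (1.0 + w * w)

# Both integrals of Eq. 15.55 with g = f; exact value is 1 on each side.
time_side = 2.0 * trapezoid(lambda t: f(t) ** 2, 0.0, 40.0, 40000)
freq_side = 2.0 * trapezoid(lambda w: F(w) ** 2, 0.0, 400.0, 40000)
print(time_side, freq_side)
```

The frequency-side cutoff must be much larger than the time-side one because $|F(\omega)|^2$ decays only as $\omega^{-4}$.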

It may be shown that the Fourier transform is a unitary operation (in the
Hilbert space $L^2$, square integrable functions). The Parseval relation is a
reflection of this unitary property, analogous to Exercise 4.5.26 for matrices.

In Fraunhofer diffraction optics the diffraction pattern (amplitude) appears 
as the transform of the function describing the aperture (compare Exercise 
15.5.5). With intensity proportional to the square of the amplitude the Parseval 
relation implies that the energy passing through the aperture seems to be some- 
where in the diffraction pattern — a statement of the conservation of energy. 

Parseval’s relations may be developed independently of the inverse Fourier 
transform and then used rigorously to derive the inverse transform. Details are 
given by Morse and Feshbach, 4 Section 4.8 (see also Exercise 15.5.4). 


2 Note that all arguments are positive in contrast to Eq. 15.54. 

3 Some authors prefer to restrict Parseval’s name to series and refer to Eq. 
15.55 as Rayleigh’s theorem. 

4 P. M. Morse, and H. Feshbach, Methods of Theoretical Physics. New York: 
McGraw-Hill (1953). 
EXERCISES

15.5.1 Work out the convolution equation corresponding to Eq. 15.53 for
(a) Fourier sine transforms

$$ \frac{1}{2} \int_{-\infty}^{\infty} g(y) f(x - y)\, dy = -\int_0^\infty F_s(s) G_s(s) \cos sx\, ds, $$

where $f$ and $g$ are odd functions,
(b) Fourier cosine transforms

$$ \frac{1}{2} \int_{-\infty}^{\infty} g(y) f(x - y)\, dy = \int_0^\infty F_c(s) G_c(s) \cos sx\, ds, $$

where $f$ and $g$ are even functions.

15.5.2 $F(\rho)$ and $G(\rho)$ are the Hankel transforms of $f(r)$ and $g(r)$, respectively (Exercise
15.1.1). Derive the Hankel transform Parseval relation:

$$ \int_0^\infty F^*(\rho) G(\rho)\, \rho\, d\rho = \int_0^\infty f^*(r) g(r)\, r\, dr. $$

15.5.3 Show that for both Fourier sine and Fourier cosine transforms Parseval's relation
has the form

$$ \int_0^\infty F(t) G(t)\, dt = \int_0^\infty f(y) g(y)\, dy. $$

15.5.4 Starting from Parseval's relation (Eq. 15.54), let $g(y) = 1$, $0 \leq y \leq \alpha$, and zero
elsewhere. From this derive the Fourier inverse transform (Eq. 15.23).

Hint. Differentiate with respect to $\alpha$.


15.5.5 (a) A rectangular pulse is described by

$$ f(x) = \begin{cases} 1, & |x| < a, \\ 0, & |x| > a. \end{cases} $$

Show that the Fourier exponential transform is

$$ F(t) = \sqrt{\frac{2}{\pi}}\, \frac{\sin at}{t}. $$

Here is the single slit diffraction problem of physical optics. The slit is
described by $f(x)$. The diffraction pattern amplitude is given by the Fourier
transform $F(t)$.

(b) Use the Parseval relation to evaluate

$$ \int_{-\infty}^{\infty} \frac{\sin^2 t}{t^2}\, dt. $$

This integral may also be evaluated by using the calculus of residues,
Exercise 7.2.12.

ANS. (b) $\pi$.

15.5.6 Solve Poisson's equation $\nabla^2 \psi(\mathbf{r}) = -\rho(\mathbf{r})/\varepsilon_0$ by the following sequence of
operations:

(a) Take the Fourier transform of both sides of this equation. Solve for the
Fourier transform of $\psi(\mathbf{r})$.

(b) Carry out the Fourier inverse transform by using a three-dimensional
analog of the convolution theorem, Eq. 15.53.
15.5.7 (a) Given $f(x) = 1 - |x/2|$, $-2 \leq x \leq 2$, and zero elsewhere, show that the
Fourier transform of $f(x)$ is

$$ F(t) = \sqrt{\frac{2}{\pi}} \left( \frac{\sin t}{t} \right)^2. $$

(b) Using the Parseval relation, evaluate

$$ \int_{-\infty}^{\infty} \left( \frac{\sin t}{t} \right)^4 dt. $$

ANS. (b) $\dfrac{2\pi}{3}$.

15.5.8 With $F(t)$ and $G(t)$ the Fourier transforms of $f(x)$ and $g(x)$, respectively, show that

$$ \int_{-\infty}^{\infty} |f(x) - g(x)|^2\, dx = \int_{-\infty}^{\infty} |F(t) - G(t)|^2\, dt. $$

If $g(x)$ is an approximation to $f(x)$, the preceding relation indicates that the
mean square deviation in $t$-space is equal to the mean square deviation in $x$-space.

15.5.9 Use the Parseval relation to evaluate

(a) $$ \int_0^\infty \frac{d\omega}{(\omega^2 + a^2)^2}, $$

(b) $$ \int_0^\infty \frac{\omega^2\, d\omega}{(\omega^2 + a^2)^2}. $$

Hint. Compare Exercise 15.3.4.

ANS. (a) $\dfrac{\pi}{4a^3}$, (b) $\dfrac{\pi}{4a}$.

15.6 MOMENTUM REPRESENTATION 


In advanced dynamics and in quantum mechanics linear momentum and
spatial position occur on an equal footing. In this section we shall start with the
usual space distribution and derive the corresponding momentum distribution.
For the one-dimensional case our wave function $\psi(x)$, a solution of the
Schrödinger wave equation, has the following properties:

1. $\psi^*(x)\psi(x)\,dx$ is the probability of finding the quantum particle
between $x$ and $x + dx$ and

2. $$\int_{-\infty}^{\infty} \psi^*(x)\psi(x)\,dx = 1, \tag{15.58}$$

corresponding to one particle (along the $x$-axis).


In addition, we have

3. $$\langle x\rangle = \int_{-\infty}^{\infty} \psi^*(x)\,x\,\psi(x)\,dx \tag{15.59}$$

for the average position of the particle along the $x$-axis. This is often
called an expectation value.

We want a function $g(p)$ that will give the same information about the
momentum.

1. $g^*(p)g(p)\,dp$ is the probability that our quantum
particle has a momentum between $p$ and $p + dp$.

2. $$\int_{-\infty}^{\infty} g^*(p)g(p)\,dp = 1. \tag{15.60}$$

3. $$\langle p\rangle = \int_{-\infty}^{\infty} g^*(p)\,p\,g(p)\,dp. \tag{15.61}$$


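As subsequently shown, such a function is given by the Fourier transform of our
space function $\psi(x)$. Specifically,$^1$

$$g(p) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} \psi(x)\,e^{-ipx/\hbar}\,dx, \tag{15.62}$$

$$g^*(p) = \frac{1}{\sqrt{2\pi\hbar}}\int_{-\infty}^{\infty} \psi^*(x)\,e^{ipx/\hbar}\,dx. \tag{15.63}$$

The corresponding three-dimensional momentum function is

$$g(\mathbf{p}) = \frac{1}{(2\pi\hbar)^{3/2}}\int \psi(\mathbf{r})\,e^{-i\mathbf{r}\cdot\mathbf{p}/\hbar}\,d^3x.$$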


To verify Eqs. 15.62 and 15.63, let us check on properties 2 and 3. 

Property 2, the normalization, is automatically satisfied as a Parseval relation,
Eq. 15.55. If the space function $\psi(x)$ is normalized to unity, the momentum
function $g(p)$ is also normalized to unity.
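The normalization statement can be checked numerically. The following is a minimal sketch, not part of the original development: it assumes units with $\hbar = 1$, takes the Gaussian of Exercise 15.6.3 with $a = 1$ as the space function, and evaluates Eq. 15.62 by a simple trapezoidal rule. The names `psi`, `g`, `trapezoid`, `norm_x`, and `norm_p` are illustrative only.

```python
import cmath
import math

HBAR = 1.0  # assumption: units in which hbar = 1


def psi(x, a=1.0):
    """Normalized Gaussian space function (the form quoted in Exercise 15.6.3)."""
    return a ** -0.5 * math.pi ** -0.25 * math.exp(-x * x / (2 * a * a))


def trapezoid(f, lo, hi, n):
    """Composite trapezoidal rule on n equal subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


def g(p):
    """Momentum function of Eq. 15.62, evaluated by quadrature over x."""
    integrand = lambda x: psi(x) * cmath.exp(-1j * p * x / HBAR)
    return trapezoid(integrand, -10.0, 10.0, 400) / math.sqrt(2 * math.pi * HBAR)


# Both integrals should be close to 1 (Parseval relation).
norm_x = trapezoid(lambda x: abs(psi(x)) ** 2, -10.0, 10.0, 400)
norm_p = trapezoid(lambda p: abs(g(p)) ** 2, -10.0, 10.0, 400)
print(norm_x, norm_p)
```

The integration limits of $\pm 10$ are an assumption that is adequate only because the Gaussian is negligible there; a more slowly decaying $\psi(x)$ would need a wider range.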

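To check on property 3, we must show that

$$\langle p\rangle = \int_{-\infty}^{\infty} g^*(p)\,p\,g(p)\,dp = \int_{-\infty}^{\infty} \psi^*(x)\,\frac{\hbar}{i}\frac{d}{dx}\,\psi(x)\,dx, \tag{15.64}$$

where $(\hbar/i)(d/dx)$ is the momentum operator in the space representation. We
replace the momentum functions by Fourier transformed space functions, and
the first integral becomes

$$\frac{1}{2\pi\hbar}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} p\,e^{-ip(x - x')/\hbar}\,\psi^*(x')\,\psi(x)\,dp\,dx'\,dx. \tag{15.65}$$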


Now

$$p\,e^{-ip(x - x')/\hbar} = \frac{d}{dx}\left[-\frac{\hbar}{i}\,e^{-ip(x - x')/\hbar}\right]. \tag{15.66}$$

$^1$ The $\hbar$ may be avoided by using the wave number $k$, $p = k\hbar$ (and $\mathbf{p} = \mathbf{k}\hbar$),
so that

$$\varphi(k) = \frac{1}{(2\pi)^{1/2}}\int \psi(x)\,e^{-ikx}\,dx.$$

An example of this notation appears in Section 16.1.


Substituting into Eq. 15.65 and integrating by parts, holding $x'$ and $p$ constant,
we obtain

$$\langle p\rangle = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\left[\frac{1}{2\pi\hbar}\int_{-\infty}^{\infty} e^{-ip(x - x')/\hbar}\,dp\right]\psi^*(x')\,\frac{\hbar}{i}\frac{d}{dx}\,\psi(x)\,dx'\,dx. \tag{15.67}$$

Here we assume $\psi(x)$ vanishes as $x \to \pm\infty$, eliminating the integrated part.
Again using the Dirac delta function, Eq. 15.21c, Eq. 15.67 reduces to Eq. 15.64
to verify our momentum representation. The reader will note that technically we
have employed the inverse Fourier transform in Eq. 15.62. This was chosen
deliberately to yield the proper sign in Eq. 15.67.


EXAMPLE 15.6.1 Hydrogen Atom 

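The hydrogen atom ground state$^2$ may be described by the spatial wave
function

$$\psi(\mathbf{r}) = \left(\frac{1}{\pi a_0^3}\right)^{1/2} e^{-r/a_0}, \tag{15.68}$$

$a_0$ being the Bohr radius, $\hbar^2/me^2$. We now have a three-dimensional wave function.
The transform corresponding to Eq. 15.62 is

$$g(\mathbf{p}) = \frac{1}{(2\pi\hbar)^{3/2}}\int \psi(\mathbf{r})\,e^{-i\mathbf{p}\cdot\mathbf{r}/\hbar}\,d^3r. \tag{15.69}$$

Substituting Eq. 15.68 into Eq. 15.69 and using

$$\int e^{-ar + i\mathbf{b}\cdot\mathbf{r}}\,d^3r = \frac{8\pi a}{(a^2 + b^2)^2}, \tag{15.70}$$

we obtain the hydrogenic momentum wave function

$$g(\mathbf{p}) = \frac{2^{3/2}}{\pi}\,\frac{a_0^{3/2}\hbar^{5/2}}{(a_0^2 p^2 + \hbar^2)^2}. \tag{15.71}$$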


Such momentum functions have been found useful in problems like Compton 
scattering from atomic electrons, the wavelength distribution of the scattered 
radiation, depending on the momentum distribution of the target electrons. 
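As a check on Eq. 15.71 (compare Exercise 15.6.9), the normalization integral $\int g^*(\mathbf{p})g(\mathbf{p})\,d^3p = 4\pi\int_0^\infty p^2 g^2\,dp$ can be evaluated numerically. The sketch below assumes units with $a_0 = \hbar = 1$ and uses the substitution $p = \tan\theta$ to map the infinite range onto $[0, \pi/2)$; the function names and the choice of the midpoint rule are illustrative, not taken from the text.

```python
import math


def g(p, a0=1.0, hbar=1.0):
    """Hydrogen ground-state momentum function, Eq. 15.71."""
    return (2 ** 1.5 / math.pi) * a0 ** 1.5 * hbar ** 2.5 / (a0 ** 2 * p ** 2 + hbar ** 2) ** 2


def normalization(n=2000):
    """4*pi * integral of p^2 g(p)^2 over 0 <= p < infinity.

    With p = tan(theta), dp = d(theta)/cos(theta)^2; the midpoint rule
    never samples the endpoint theta = pi/2.
    """
    h = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        p = math.tan(theta)
        jacobian = 1.0 / math.cos(theta) ** 2
        total += p * p * g(p) ** 2 * jacobian
    return 4 * math.pi * total * h


norm = normalization()
print(norm)  # should be close to 1
```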

The relation between the ordinary space representation and the momentum
representation may be clarified by considering the basic commutation relations
of quantum mechanics. We can go from a classical Hamiltonian to the Schrödinger
wave equation by requiring that momentum $p$ and position $x$ not commute.
Instead, we require that

$$[p, x] \equiv (px - xp) = -i\hbar. \tag{15.72}$$

$^2$ See E. V. Ivash, "A momentum representation treatment of the hydrogen
atom problem," Am. J. Phys. 40, 1095 (1972) for a momentum representation
treatment of the hydrogen atom, $l = 0$ states.

For the multidimensional case Eq. 15.72 is replaced by

$$[p_i, x_j] = -i\hbar\,\delta_{ij}. \tag{15.73}$$

The Schrödinger (space) representation is obtained by using

$$(x)\colon\qquad x_j \to x_j, \qquad p_j \to -i\hbar\frac{\partial}{\partial x_j},$$

replacing the momentum by a partial space derivative. The reader will easily
see that

$$[p, x]\psi(x) = -i\hbar\,\psi(x).$$

However, Eq. 15.72 can equally well be satisfied by using

$$(p)\colon\qquad x_j \to i\hbar\frac{\partial}{\partial p_j}, \qquad p_j \to p_j. \tag{15.74}$$

This is the momentum representation. Then

$$[p, x]g(p) = -i\hbar\,g(p). \tag{15.75}$$

Hence the representation (x) is not unique; (p) is an alternate possibility.

In general, the Schrodinger representation (x) leading to the Schrodinger 
wave equation is more convenient because the potential energy V is generally 
given as a function of position V(x,y,z). The momentum representation (p) 
usually leads to an integral equation (compare Chapter 16 for the pros and cons 
of the integral equations). For an exception, consider the harmonic oscillator. 


EXAMPLE 15.6.2 Harmonic Oscillator 

The classical Hamiltonian (kinetic energy + potential energy = total energy)
is

$$H(p, x) = \frac{p^2}{2m} + \frac{1}{2}kx^2 = E, \tag{15.76}$$

where $k$ is the Hooke's law constant.

In the Schrödinger representation we obtain

$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + \frac{1}{2}kx^2\psi(x) = E\psi(x). \tag{15.77}$$

For total energy $E$ equal to $\sqrt{k/m}\,\hbar/2$ there is a solution (Section 13.1)

$$\psi(x) = e^{-(\sqrt{mk}/2\hbar)x^2}. \tag{15.78}$$

The momentum representation leads to

$$\frac{p^2}{2m}\,g(p) - \frac{\hbar^2 k}{2}\frac{d^2 g(p)}{dp^2} = E\,g(p). \tag{15.79}$$

Again, for

$$E = \sqrt{\frac{k}{m}}\,\frac{\hbar}{2} \tag{15.80}$$

the momentum wave equation (15.79) is satisfied by

$$g(p) = e^{-p^2/(2\hbar\sqrt{mk})}. \tag{15.81}$$

Either representation, space or momentum (and an infinite number of other
possibilities), may be used, depending on which is more convenient for the
particular problem under attack.

The demonstration that $g(p)$ is the momentum wave function corresponding
to Eq. 15.78, that is, that it is the Fourier inverse transform of Eq. 15.78, is left as
Exercise 15.6.3.
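That Eq. 15.81 satisfies the momentum wave equation (15.79) with the energy of Eq. 15.80 can be confirmed by a short finite-difference calculation. This is only a sketch under stated assumptions: units with $m = k = \hbar = 1$, a central-difference approximation for $d^2g/dp^2$, and a hypothetical helper `residual` that returns the left side of Eq. 15.79 minus the right side.

```python
import math

m, k, hbar = 1.0, 1.0, 1.0           # assumption: convenient units
E = 0.5 * hbar * math.sqrt(k / m)    # Eq. 15.80


def g(p):
    """Momentum wave function of Eq. 15.81 (unnormalized)."""
    return math.exp(-p * p / (2 * hbar * math.sqrt(m * k)))


def residual(p, h=1e-3):
    """Left side minus right side of Eq. 15.79, using a central difference for g''."""
    g2 = (g(p + h) - 2 * g(p) + g(p - h)) / (h * h)
    return p * p / (2 * m) * g(p) - 0.5 * hbar ** 2 * k * g2 - E * g(p)


# Largest residual on a grid of momenta from -3 to 3; it should be of order h^2.
worst = max(abs(residual(0.25 * i)) for i in range(-12, 13))
print(worst)
```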


EXERCISES 

15.6.1 The function $e^{i\mathbf{k}\cdot\mathbf{r}}$ describes a plane wave of momentum $\mathbf{p} = \hbar\mathbf{k}$ normalized to
unit density. (Time dependence of $e^{-i\omega t}$ is assumed.) Show that these plane wave
functions satisfy an orthogonality relation

$$\int \left(e^{i\mathbf{k}\cdot\mathbf{r}}\right)^* e^{i\mathbf{k}'\cdot\mathbf{r}}\,dx\,dy\,dz = (2\pi)^3\,\delta(\mathbf{k} - \mathbf{k}').$$

15.6.2 An infinite plane wave in quantum mechanics may be represented by the
function

$$\psi(x) = e^{ip'x/\hbar}.$$

Find the corresponding momentum distribution function. Note that it has an
infinity and that $\psi(x)$ is not normalized.

15.6.3 A linear quantum oscillator in its ground state has a wave function

$$\psi(x) = a^{-1/2}\pi^{-1/4}e^{-x^2/2a^2}.$$

Show that the corresponding momentum function is

$$g(p) = a^{1/2}\pi^{-1/4}\hbar^{-1/2}e^{-a^2p^2/2\hbar^2}.$$

15.6.4 The $n$th excited state of the linear quantum oscillator is described by

$$\psi_n(x) = a^{-1/2}\,2^{-n/2}\,\pi^{-1/4}\,(n!)^{-1/2}\,e^{-x^2/2a^2}H_n(x/a),$$

where $H_n(x/a)$ is the $n$th Hermite polynomial, Section 13.1. As an extension
of Exercise 15.6.3, find the momentum function corresponding to $\psi_n(x)$.

Hint. $\psi_n(x)$ may be represented by $(\hat{a}^\dagger)^n\psi_0(x)$, where $\hat{a}^\dagger$ is the raising operator,
Exercise 13.1.16.

15.6.5 A free particle in quantum mechanics is described by a plane wave

$$\psi_k(x, t) = e^{i[kx - (\hbar k^2/2m)t]}.$$

Combining waves of adjacent momentum with an amplitude weighting factor
$\varphi(k)$, we form a wave packet

$$\Psi(x, t) = \int_{-\infty}^{\infty}\varphi(k)\,e^{i[kx - (\hbar k^2/2m)t]}\,dk.$$

(a) Solve for $\varphi(k)$ given that

$$\Psi(x, 0) = e^{-x^2/2a^2}.$$

(b) Using the known value of $\varphi(k)$, integrate to get the explicit form of $\Psi(x, t)$.
Note that this wave packet diffuses or spreads out with time.

ANS. $$\Psi(x, t) = \frac{e^{-x^2/\{2[a^2 + (i\hbar/m)t]\}}}{[1 + (i\hbar t/ma^2)]^{1/2}}.$$

Note. An interesting discussion of this problem from the evolution operator
point of view is given by S. M. Blinder, "Evolution of a Gaussian wavepacket,"
Am. J. Phys. 36, 525 (1968).

15.6.6 Find the time-dependent momentum wave function $g(k, t)$ corresponding to
$\Psi(x, t)$ of Exercise 15.6.5. Show that the momentum wave packet $g^*(k, t)g(k, t)$
is independent of time.

15.6.7 The deuteron, Example 9.1.2, may be described reasonably well with a Hulthén
wave function

$$\psi(r) = A[e^{-\alpha r} - e^{-\beta r}]/r,$$

with $A$, $\alpha$, and $\beta$ constants. Find $g(\mathbf{p})$, the corresponding momentum function.
Note. The Fourier transform may be rewritten as Fourier sine and cosine
transforms or as a Laplace transform, Section 15.8.

15.6.8 The nuclear form factor $F(\mathbf{k})$ and the charge distribution $\rho(\mathbf{r})$ are
three-dimensional Fourier transforms of each other:

$$F(\mathbf{k}) = \frac{1}{(2\pi)^{3/2}}\int \rho(\mathbf{r})\,e^{i\mathbf{k}\cdot\mathbf{r}}\,d^3r.$$

If the measured form factor is

$$F(k) = (2\pi)^{-3/2}\left(1 + \frac{k^2}{a^2}\right)^{-1},$$

find the corresponding charge distribution.

ANS. $$\rho(r) = \frac{a^2}{4\pi}\,\frac{e^{-ar}}{r}.$$

15.6.9 Check the normalization of the hydrogen momentum wave function

$$g(\mathbf{p}) = \frac{2^{3/2}}{\pi}\,\frac{a_0^{3/2}\hbar^{5/2}}{(a_0^2p^2 + \hbar^2)^2}$$

by direct evaluation of the integral

$$\int g^*(\mathbf{p})\,g(\mathbf{p})\,d^3p.$$

15.6.10 With $\psi(\mathbf{r})$ a wave function in ordinary space and $\varphi(\mathbf{p})$ the corresponding
momentum function, show that

(a) $$\frac{1}{(2\pi\hbar)^{3/2}}\int \mathbf{r}\,\psi(\mathbf{r})\,e^{-i\mathbf{r}\cdot\mathbf{p}/\hbar}\,d^3x = i\hbar\,\nabla_p\,\varphi(\mathbf{p}),$$

(b) $$\frac{1}{(2\pi\hbar)^{3/2}}\int r^2\,\psi(\mathbf{r})\,e^{-i\mathbf{r}\cdot\mathbf{p}/\hbar}\,d^3x = (i\hbar\,\nabla_p)^2\,\varphi(\mathbf{p}).$$

Note. $\nabla_p$ is the gradient in momentum space:

$$\nabla_p = \mathbf{i}\frac{\partial}{\partial p_x} + \mathbf{j}\frac{\partial}{\partial p_y} + \mathbf{k}\frac{\partial}{\partial p_z}.$$

These results may be extended to any positive integer power of $r$ and therefore
to any (analytic) function that may be expanded as a Maclaurin series in $r$.


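15.6.11 The ordinary space wave function $\psi(\mathbf{r}, t)$ satisfies the time-dependent
Schrödinger equation

$$i\hbar\frac{\partial\psi(\mathbf{r}, t)}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{r}, t) + V(\mathbf{r})\,\psi(\mathbf{r}, t).$$

Show that the corresponding time-dependent momentum function satisfies the
analogous equation

$$i\hbar\frac{\partial\varphi(\mathbf{p}, t)}{\partial t} = \frac{p^2}{2m}\,\varphi(\mathbf{p}, t) + V(i\hbar\nabla_p)\,\varphi(\mathbf{p}, t).$$

Note. Assume that $V(\mathbf{r})$ may be expressed by a Maclaurin series and use Exercise
15.6.10.

$V(i\hbar\nabla_p)$ is the same function of the variable $i\hbar\nabla_p$ that $V(\mathbf{r})$ is of the variable $\mathbf{r}$.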


15.6.12 The one-dimensional time-independent Schrödinger wave equation is

$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + V(x)\,\psi(x) = E\,\psi(x).$$

For the special case of $V(x)$ an analytic function of $x$, show that the corresponding
momentum wave equation is

$$V\!\left(i\hbar\frac{d}{dp}\right)g(p) + \frac{p^2}{2m}\,g(p) = E\,g(p).$$

Derive this momentum wave equation from the Fourier transform, Eq. 15.62,
and its inverse. Do not use the substitution $x \to i\hbar(d/dp)$ directly.


15.7 TRANSFER FUNCTIONS

A time-dependent electrical pulse may be regarded as built up as a
superposition of plane waves of many frequencies. For angular frequency $\omega$ we have a
contribution

$$F(\omega)\,e^{i\omega t}.$$

Then the complete pulse may be written as

$$f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)\,e^{i\omega t}\,d\omega. \tag{15.82}$$

Because the angular frequency $\omega$ is related to the linear frequency $\nu$ by

$$\nu = \frac{\omega}{2\pi},$$

it is customary to associate the entire $1/2\pi$ factor with this integral.

FIG. 15.5 Device with input $f(t)$ and output $g(t)$

But if $\omega$ is a frequency, what about the negative frequencies? The negative
$\omega$'s may be looked on as a mathematical device to avoid dealing with two
functions ($\cos\omega t$ and $\sin\omega t$) separately (compare Section 14.1).

Because Eq. 15.82 has the form of a Fourier transform, we may solve for $F(\omega)$
by writing the inverse transform

$$F(\omega) = \int_{-\infty}^{\infty} f(t)\,e^{-i\omega t}\,dt. \tag{15.83}$$

Equation 15.83 represents a resolution of the pulse $f(t)$ into its angular frequency
components. Equation 15.82 is a synthesis of the pulse from its components.

Consider some device such as a servomechanism or a stereo amplifier (Fig.
15.5) with an input $f(t)$ and an output $g(t)$. For an input of a single frequency $\omega$,
$f_\omega(t) = e^{i\omega t}$, the amplifier will alter the amplitude and may also change the phase.
The changes will probably depend on the frequency. Hence

$$g_\omega(t) = \varphi(\omega)\,f_\omega(t). \tag{15.84}$$

This amplitude and phase modifying function, $\varphi(\omega)$, is called a transfer function.
It usually will be complex:

$$\varphi(\omega) = u(\omega) + i\,v(\omega), \tag{15.85}$$

where the functions $u(\omega)$ and $v(\omega)$ are real.

In Eq. 15.84 we assume that the transfer function $\varphi(\omega)$ is independent of input
amplitude and of the presence or absence of any other frequency components.
That is, we are assuming a linear mapping of $f(t)$ onto $g(t)$. Then the total output
may be obtained by integrating over the entire input, as modified by the amplifier

$$g(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \varphi(\omega)\,F(\omega)\,e^{i\omega t}\,d\omega. \tag{15.86}$$

The transfer function is characteristic of the amplifier. Once the transfer
function is known (measured or calculated), the output $g(t)$ can be calculated for
any input $f(t)$. Let us consider $\varphi(\omega)$ as the Fourier (inverse) transform of some
function $\Phi(t)$:

$$\varphi(\omega) = \int_{-\infty}^{\infty} \Phi(t)\,e^{-i\omega t}\,dt. \tag{15.87}$$

Then Eq. 15.86 is the Fourier transform of a product of two inverse transforms. From Section
15.5 we obtain the convolution

$$g(t) = \int_{-\infty}^{\infty} f(\tau)\,\Phi(t - \tau)\,d\tau. \tag{15.88}$$

Interpreting Eq. 15.88, we have an input (a "cause") $f(\tau)$, modified by
$\Phi(t - \tau)$, producing an output (an "effect") $g(t)$. Adopting the concept of
causality, that the cause precedes the effect, we must require $\tau < t$. We do
this by requiring

$$\Phi(t - \tau) = 0, \qquad \tau > t. \tag{15.89}$$

Then Eq. 15.88 becomes

$$g(t) = \int_{-\infty}^{t} f(\tau)\,\Phi(t - \tau)\,d\tau. \tag{15.90}$$

The adoption of Eq. 15.89 has profound consequences here and equivalently
in the dispersion theory, Section 7.3.
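The causal convolution of Eq. 15.90 lends itself to a direct numerical illustration. The sketch below is not from the text: it assumes a hypothetical response function $\Phi(t) = e^{-t}$ for $t \ge 0$ (zero otherwise, as Eq. 15.89 requires) and a unit step input switched on at $\tau = 0$, for which the integral can be done by hand, $g(t) = 1 - e^{-t}$.

```python
import math


def response(t):
    """Hypothetical causal response: Phi(t) = exp(-t) for t >= 0, zero otherwise (Eq. 15.89)."""
    return math.exp(-t) if t >= 0 else 0.0


def step_input(tau):
    """Illustrative input: a unit step switched on at tau = 0."""
    return 1.0 if tau >= 0 else 0.0


def output(t, n=2000):
    """Eq. 15.90 by the trapezoidal rule.

    The input vanishes for tau < 0, so the lower limit may be taken as 0.
    """
    if t <= 0:
        return 0.0
    h = t / n
    total = 0.5 * (step_input(0.0) * response(t) + step_input(t) * response(0.0))
    for i in range(1, n):
        tau = i * h
        total += step_input(tau) * response(t - tau)
    return total * h


g2 = output(2.0)
print(g2, 1 - math.exp(-2.0))  # numerical and closed-form values
```

Because the response vanishes for negative argument, the output at time $t$ depends only on the input at earlier times, and no output appears before the input is switched on.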


Significance of $\Phi(t)$

To see the significance of $\Phi$, let $f(\tau)$ be a sudden impulse starting at $\tau = 0$,

$$f(\tau) = \delta(\tau),$$

where $\delta(\tau)$ is a Dirac delta distribution on the positive side of the origin. Then
Eq. 15.90 becomes

$$g(t) = \int_{-\infty}^{t} \delta(\tau)\,\Phi(t - \tau)\,d\tau = \begin{cases} \Phi(t), & t > 0 \\ 0, & t < 0. \end{cases} \tag{15.91}$$

This identifies $\Phi(t)$ as the output function corresponding to a unit impulse at
$t = 0$. Equation 15.91 also serves to establish that $\Phi(t)$ is real. Our original
transfer function gives the steady-state output corresponding to a unit amplitude
single frequency input. $\Phi(t)$ and $\varphi(\omega)$ are Fourier transforms of each other.

From Eq. 15.87 we now have

$$\varphi(\omega) = \int_0^{\infty} \Phi(t)\,e^{-i\omega t}\,dt, \tag{15.92}$$

with the lower limit set equal to zero by causality (Eq. 15.89). With $\Phi(t)$ real from
Eq. 15.91 we separate real and imaginary parts and write

$$u(\omega) = \int_0^{\infty} \Phi(t)\cos\omega t\,dt, \qquad v(\omega) = -\int_0^{\infty} \Phi(t)\sin\omega t\,dt, \qquad t > 0. \tag{15.93}$$

From this we see that the real part of $\varphi(\omega)$, $u(\omega)$, is even, whereas the imaginary part of
$\varphi(\omega)$, $v(\omega)$, is odd:

$$u(-\omega) = u(\omega), \qquad v(-\omega) = -v(\omega).$$

Compare this result with Exercise 15.3.1.

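Interpreting Eq. 15.93 as Fourier cosine and sine transforms, we have

$$\Phi(t) = \frac{2}{\pi}\int_0^{\infty} u(\omega)\cos\omega t\,d\omega = -\frac{2}{\pi}\int_0^{\infty} v(\omega)\sin\omega t\,d\omega, \qquad t > 0. \tag{15.94}$$

Combining Eqs. 15.93 and 15.94, we obtain

$$v(\omega) = -\int_0^{\infty} \sin\omega t\left[\frac{2}{\pi}\int_0^{\infty} u(\omega')\cos\omega' t\,d\omega'\right]dt, \tag{15.95}$$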


showing that if our transfer function has a real part, it will also have an imaginary 
part (and vice versa). Of course, this assumes that the Fourier transforms exist, 
thus excluding cases such as 0(co) = 1. 

The imposition of causality has led to a mutual interdependence of the real 
and imaginary parts of the transfer function. The reader should compare this 
with the results of the dispersion theory of Section 7.3, also involving causality. 
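The interdependence of $u(\omega)$ and $v(\omega)$ can be made concrete with a simple model. The sketch below assumes a hypothetical causal response $\Phi(t) = e^{-t}$, $t > 0$, for which Eq. 15.93 gives $u(\omega) = 1/(1 + \omega^2)$ and $v(\omega) = -\omega/(1 + \omega^2)$. It evaluates Eq. 15.93 numerically and then rebuilds $\Phi(t)$ from $u(\omega)$ alone by the cosine integral of Eq. 15.94. The cutoffs (`40.0` in $t$, `w_max` in $\omega$) are assumptions chosen for this particular $\Phi$.

```python
import math


def simpson(f, lo, hi, n):
    """Composite Simpson rule; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3


def u(w):
    """Real part of the transfer function, Eq. 15.93, for Phi(t) = exp(-t)."""
    return simpson(lambda t: math.exp(-t) * math.cos(w * t), 0.0, 40.0, 4000)


def v(w):
    """Imaginary part of the transfer function, Eq. 15.93, for Phi(t) = exp(-t)."""
    return -simpson(lambda t: math.exp(-t) * math.sin(w * t), 0.0, 40.0, 4000)


def phi_from_u(t, w_max=200.0, n=20000):
    """Eq. 15.94: rebuild Phi(t), t > 0, from u alone (closed form u = 1/(1 + w^2))."""
    return (2 / math.pi) * simpson(lambda w: math.cos(w * t) / (1 + w * w), 0.0, w_max, n)


u1, v1 = u(1.0), v(1.0)      # closed forms: 1/2 and -1/2
rebuilt = phi_from_u(1.0)    # should be close to exp(-1)
print(u1, v1, rebuilt, math.exp(-1.0))
```

Once $\Phi(t)$ has been recovered from $u(\omega)$, the sine integral of Eq. 15.93 yields $v(\omega)$, which is the content of Eq. 15.95.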

It may be helpful to show that the parity properties of $u(\omega)$ and $v(\omega)$ require
$\Phi(t)$ to vanish for negative $t$. Inverting Eq. 15.87, we have

$$\Phi(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} [u(\omega) + i\,v(\omega)][\cos\omega t + i\sin\omega t]\,d\omega. \tag{15.96}$$

With $u(\omega)$ even and $v(\omega)$ odd, Eq. 15.96 becomes

$$\Phi(t) = \frac{1}{\pi}\int_0^{\infty} u(\omega)\cos\omega t\,d\omega - \frac{1}{\pi}\int_0^{\infty} v(\omega)\sin\omega t\,d\omega. \tag{15.97}$$

From Eq. 15.94

$$\int_0^{\infty} u(\omega)\cos\omega t\,d\omega = -\int_0^{\infty} v(\omega)\sin\omega t\,d\omega, \qquad t > 0. \tag{15.98}$$

If we reverse the sign of $t$, $\sin\omega t$ reverses sign and from Eq. 15.97

$$\Phi(t) = 0, \qquad t < 0$$

(demonstrating the internal consistency of our analysis).


EXERCISE 


15.7.1 Derive the convolution

$$g(t) = \int_{-\infty}^{\infty} f(\tau)\,\Phi(t - \tau)\,d\tau.$$


15.8 ELEMENTARY LAPLACE TRANSFORMS


Definition 

The Laplace transform $f(s)$ or $\mathcal{L}\{F(t)\}$ of a function $F(t)$ is defined by$^1$

$$f(s) = \mathcal{L}\{F(t)\} = \lim_{a\to\infty}\int_0^{a} e^{-st}F(t)\,dt = \int_0^{\infty} e^{-st}F(t)\,dt. \tag{15.99}$$


A few comments on the existence of the integral might be in order. The infinite
integral of $F(t)$,

$$\int_0^{\infty} F(t)\,dt,$$

need not exist. For instance, $F(t)$ may diverge exponentially for large $t$. However,
if there is some constant $s_0$ such that

$$|e^{-s_0 t}F(t)| \le M, \tag{15.100}$$

a positive constant for sufficiently large $t$, $t > t_0$, the Laplace transform (Eq.
15.99) will exist for $s > s_0$; $F(t)$ is said to be of exponential order. As a
counterexample, $F(t) = e^{t^2}$ does not satisfy the condition given by Eq. 15.100 and is not
of exponential order. $\mathcal{L}\{e^{t^2}\}$ does not exist.

The Laplace transform may also fail to exist because of a sufficiently strong
singularity in the function $F(t)$ as $t \to 0$; that is,

$$\int_0^{\infty} e^{-st}\,t^n\,dt$$

diverges at the origin for $n \le -1$. The Laplace transform $\mathcal{L}\{t^n\}$ does not exist
for $n \le -1$.

Since, for two functions $F(t)$ and $G(t)$ for which the integrals exist,

$$\mathcal{L}\{aF(t) + bG(t)\} = a\,\mathcal{L}\{F(t)\} + b\,\mathcal{L}\{G(t)\}, \tag{15.101}$$

the operation denoted by $\mathcal{L}$ is linear.


Elementary Functions 

To introduce the Laplace transform, let us apply the operation to some of the
elementary functions. In all cases we assume that $F(t) = 0$ for $t < 0$. Let

$$F(t) = 1, \qquad t > 0.$$

Then

$$\mathcal{L}\{1\} = \int_0^{\infty} e^{-st}\,dt = \frac{1}{s}, \qquad \text{for } s > 0. \tag{15.102}$$

$^1$ This is sometimes called a one-sided Laplace transform; the integral from
$-\infty$ to $+\infty$ is referred to as a two-sided Laplace transform. Some authors
introduce an additional factor of $s$. This extra $s$ appears to have little advantage
and continually gets in the way (compare Jeffreys and Jeffreys, Section 14.13
for additional comments). Generally, we take $s$ to be real and positive. It is
possible to have $s$ complex provided $\Re(s) > 0$.


Again, let

$$F(t) = e^{kt}, \qquad t > 0.$$

The Laplace transform becomes

$$\mathcal{L}\{e^{kt}\} = \int_0^{\infty} e^{-st}e^{kt}\,dt = \frac{1}{s - k}, \qquad \text{for } s > k. \tag{15.103}$$


Using this relation, we may easily obtain the Laplace transform of certain
other functions. Since

$$\cosh kt = \tfrac{1}{2}(e^{kt} + e^{-kt}), \qquad \sinh kt = \tfrac{1}{2}(e^{kt} - e^{-kt}), \tag{15.104}$$

we have

$$\mathcal{L}\{\cosh kt\} = \frac{1}{2}\left(\frac{1}{s - k} + \frac{1}{s + k}\right) = \frac{s}{s^2 - k^2},$$

$$\mathcal{L}\{\sinh kt\} = \frac{1}{2}\left(\frac{1}{s - k} - \frac{1}{s + k}\right) = \frac{k}{s^2 - k^2}, \tag{15.105}$$

both valid for $s > k$. We have the relations

$$\cos kt = \cosh ikt, \qquad \sin kt = -i\sinh ikt. \tag{15.106}$$

Using Eqs. 15.105 with $k$ replaced by $ik$, we find that the Laplace transforms are

$$\mathcal{L}\{\cos kt\} = \frac{s}{s^2 + k^2}, \qquad \mathcal{L}\{\sin kt\} = \frac{k}{s^2 + k^2}, \tag{15.107}$$

both valid for $s > 0$. Another derivation of this last transform is given in the
next section. Note that $\lim_{s\to 0}\mathcal{L}\{\sin kt\} = 1/k$. The Laplace transform assigns a
value of $1/k$ to $\int_0^{\infty}\sin kt\,dt$.
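These elementary transforms are easy to confirm by direct numerical integration of Eq. 15.99. The following is a minimal sketch under stated assumptions: the helper `laplace`, the cutoff `t_max`, and the sample values $s = 1.5$, $k = 2$ are illustrative choices, and the truncation is acceptable only because $e^{-s\,t_{\max}}$ is negligible for the functions tested.

```python
import math


def laplace(F, s, t_max=40.0, n=8000):
    """Approximate Eq. 15.99 by Simpson's rule on [0, t_max] (n even)."""
    h = t_max / n
    total = F(0.0) + math.exp(-s * t_max) * F(t_max)
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * math.exp(-s * t) * F(t)
    return total * h / 3


s, k = 1.5, 2.0
cos_num = laplace(lambda t: math.cos(k * t), s)     # compare s/(s^2 + k^2), Eq. 15.107
sin_num = laplace(lambda t: math.sin(k * t), s)     # compare k/(s^2 + k^2), Eq. 15.107
exp_num = laplace(lambda t: math.exp(0.5 * t), s)   # compare 1/(s - 0.5), Eq. 15.103

print(cos_num, s / (s * s + k * k))
print(sin_num, k / (s * s + k * k))
print(exp_num, 1 / (s - 0.5))
```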

Finally, for $F(t) = t^n$, we have

$$\mathcal{L}\{t^n\} = \int_0^{\infty} e^{-st}\,t^n\,dt,$$

which is just the factorial function. Hence

$$\mathcal{L}\{t^n\} = \frac{n!}{s^{n+1}}, \qquad s > 0, \quad n > -1. \tag{15.108}$$

The reader will note that in all these transforms we have the variable $s$ in the
denominator, that is, negative powers of $s$. In particular, $\lim_{s\to\infty} f(s) = 0$. The
significance of this point is that if $f(s)$ involves positive powers of $s$ ($\lim_{s\to\infty} f(s) \to \infty$),
then no inverse transform exists.


Inverse Transform 

There is little importance to these operations unless we can carry out the
inverse transform, as in Fourier transforms. That is, with

$$\mathcal{L}\{F(t)\} = f(s),$$

then

$$\mathcal{L}^{-1}\{f(s)\} = F(t). \tag{15.109}$$

Taken literally, this inverse transform is not unique. Two functions $F_1(t)$ and
$F_2(t)$ may have the same transform, $f(s)$. However, in this case

$$F_1(t) - F_2(t) = N(t),$$

where $N(t)$ is a null function (Fig. 15.6) indicating that

$$\int_0^{t_0} N(t)\,dt = 0$$

for all positive $t_0$. This result is known as Lerch's theorem. Therefore to the
physicist and engineer $N(t)$ may almost always be taken as zero and the inverse
operation becomes unique.





FIG. 15.6 A possible null function 


The inverse transform can be determined in various ways. (1) A table of
transforms can be built up and used to carry out the inverse transformation
exactly as a table of logarithms can be used to look up antilogarithms. The
preceding transforms constitute the embryonic beginnings of such a table. For a
more complete set of Laplace transforms see Table 15.2 or AMS-55, Chapter 29.
Employing partial fraction expansions and various operational theorems, which
are considered in succeeding sections, may facilitate use of the tables. There is
some justification for suspecting that these tables are probably of more value in
solving textbook exercises than in solving real-world problems. (2) A general
technique for $\mathcal{L}^{-1}$ will be developed in Section 15.12 by using the calculus of
residues. (3) The difficulties and the possibilities of a numerical approach,
numerical inversion, are considered at the end of this section.


Partial Fraction Expansion 

Utilization of a table of transforms (or inverse transforms) is facilitated by
expanding $f(s)$ in partial fractions.

Frequently $f(s)$, our transform, occurs in the form $g(s)/h(s)$, where $g(s)$ and
$h(s)$ are polynomials with no common factors, $g(s)$ being of lower degree than
$h(s)$. If the factors of $h(s)$ are all linear and distinct, then by the theory of partial
fractions we may write

$$f(s) = \frac{c_1}{s - a_1} + \frac{c_2}{s - a_2} + \cdots + \frac{c_n}{s - a_n}, \tag{15.110}$$

where the $c_i$'s are independent of $s$. The $a_i$'s are the roots of $h(s)$. If any one of the
roots, say $a_1$, is multiple (occurring $m$ times), then $f(s)$ has the form

$$f(s) = \frac{c_{1,m}}{(s - a_1)^m} + \frac{c_{1,m-1}}{(s - a_1)^{m-1}} + \cdots + \frac{c_{1,1}}{s - a_1} + \sum_{i=2}^{n}\frac{c_i}{s - a_i}. \tag{15.111}$$

Finally, if one of the factors is quadratic, $(s^2 + ps + q)$, the numerator, instead of
being a simple constant, will have the form

$$\frac{as + b}{s^2 + ps + q}.$$

There are various ways of determining the constants introduced. For instance,
in Eq. 15.110 we may multiply through by $(s - a_i)$ and obtain

$$c_i = \lim_{s\to a_i}(s - a_i)\,f(s). \tag{15.112}$$


In elementary cases a direct solution is often the easiest. 
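For distinct linear factors the limit in Eq. 15.112 reduces to $c_i = g(a_i)/\prod_{j\ne i}(a_i - a_j)$ when $h(s)$ is written as $\prod_j (s - a_j)$. The sketch below applies this to the illustrative case $f(s) = 1/[(s + 1)(s + 3)]$ and inverts term by term with Eq. 15.103. The function names are hypothetical, and the code assumes that $h(s)$ has leading coefficient 1 and no repeated roots.

```python
import math


def partial_fraction_coeffs(g, roots):
    """c_i of Eq. 15.110 from Eq. 15.112, for h(s) = prod_j (s - a_j) with distinct roots."""
    coeffs = []
    for i, a_i in enumerate(roots):
        denom = 1.0
        for j, a_j in enumerate(roots):
            if j != i:
                denom *= (a_i - a_j)
        coeffs.append(g(a_i) / denom)
    return coeffs


def inverse_transform(coeffs, roots, t):
    """Term-by-term inversion using Eq. 15.103: c/(s - a) -> c*exp(a*t)."""
    return sum(c * math.exp(a * t) for c, a in zip(coeffs, roots))


roots = [-1.0, -3.0]                                    # f(s) = 1/((s + 1)(s + 3))
coeffs = partial_fraction_coeffs(lambda s: 1.0, roots)  # numerator g(s) = 1
F1 = inverse_transform(coeffs, roots, 1.0)
print(coeffs, F1)
```

With $a = 1$ and $b = 3$ this reproduces the form $(e^{-at} - e^{-bt})/(b - a)$ quoted in Exercise 15.8.4(a).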


EXAMPLE 15.8.1 Partial Fraction Expansion 
Let

$$f(s) = \frac{k^2}{s(s^2 + k^2)} = \frac{c}{s} + \frac{as + b}{s^2 + k^2}. \tag{15.113}$$

Putting the right side of the equation over a common denominator and equating
like powers of $s$ in the numerator, we obtain

$$\frac{k^2}{s(s^2 + k^2)} = \frac{c(s^2 + k^2) + s(as + b)}{s(s^2 + k^2)}, \tag{15.114}$$

and

$$c + a = 0 \quad (s^2), \qquad b = 0 \quad (s^1), \qquad ck^2 = k^2 \quad (s^0).$$

Solving these ($s \ne 0$), we have

$$c = 1, \qquad b = 0, \qquad a = -1,$$

giving

$$f(s) = \frac{1}{s} - \frac{s}{s^2 + k^2} \tag{15.115}$$

and

$$\mathcal{L}^{-1}\{f(s)\} = 1 - \cos kt \tag{15.116}$$

by Eqs. 15.102 and 15.107.


EXAMPLE 15.8.2 A Step Function

As one application of Laplace transforms, consider the evaluation of

$$F(t) = \int_0^{\infty} \frac{\sin tx}{x}\,dx. \tag{15.117}$$

Suppose we take the Laplace transform of this definite (and improper)
integral:

$$\mathcal{L}\left\{\int_0^{\infty} \frac{\sin tx}{x}\,dx\right\} = \int_0^{\infty} e^{-st}\int_0^{\infty} \frac{\sin tx}{x}\,dx\,dt. \tag{15.118}$$

Now interchanging the order of integration (which must be justified!),$^2$ we get

$$\int_0^{\infty} \frac{1}{x}\left[\int_0^{\infty} e^{-st}\sin tx\,dt\right]dx = \int_0^{\infty} \frac{dx}{s^2 + x^2}, \tag{15.119}$$

since the factor in square brackets is just the Laplace transform of $\sin tx$. From
the integral tables

$$\int_0^{\infty} \frac{dx}{s^2 + x^2} = \frac{1}{s}\tan^{-1}\left(\frac{x}{s}\right)\bigg|_0^{\infty} = \frac{\pi}{2s} = f(s). \tag{15.120}$$

By Eq. 15.102 we carry out the inverse transformation to obtain

$$F(t) = \frac{\pi}{2}, \qquad t > 0, \tag{15.121}$$

in agreement with an evaluation by the calculus of residues (Section 7.2). It has
been assumed that $t > 0$ in $F(t)$. For $F(-t)$ we need note only that $\sin(-tx) =
-\sin tx$, giving $F(-t) = -F(t)$. Finally, if $t = 0$, $F(0)$ is clearly zero. Therefore

$$\int_0^{\infty} \frac{\sin tx}{x}\,dx = \frac{\pi}{2}[2u(t) - 1] = \begin{cases} \dfrac{\pi}{2}, & t > 0 \\ 0, & t = 0 \\ -\dfrac{\pi}{2}, & t < 0. \end{cases} \tag{15.122}$$

$^2$ See Jeffreys and Jeffreys, Chapter 1 (uniform convergence of integrals).

FIG. 15.7 $F(t) = \int_0^{\infty} (\sin tx/x)\,dx$, a step function

Note that $\int_0^{\infty} (\sin tx/x)\,dx$, taken as a function of $t$, describes a step function
(Fig. 15.7), a step of height $\pi$ at $t = 0$. This is consistent with Eq. 8.111.

The technique in the preceding example was to (1) introduce a second integration,
the Laplace transform, (2) reverse the order of integration and integrate,
and (3) take the inverse Laplace transform. There are many opportunities where
this technique of reversing the order of integration can be applied and prove
very useful. Exercise 15.8.6 is a variation of this.


Numerical Inversion 

As an integration the Laplace transform is a highly stable operation, stable
in the sense that small fluctuations (or errors) in $F(t)$ are averaged out in the
determination of the area under a curve. Also, the weighting factor, $e^{-st}$, means
that the behavior of $F(t)$ at large $t$ is effectively ignored unless $s$ is small. As a
result of these two effects, a large change in $F(t)$ at large $t$ indicates a very small,
perhaps insignificant, change in $f(s)$. In contrast to the Laplace transform operation,
the inverse operation, going from $f(s)$ to $F(t)$, is highly unstable. A tiny change in $f(s)$ may result
in a wild variation of $F(t)$. All significant figures may disappear. In a matrix
formulation the matrix is ill-conditioned with respect to inversion.
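The asymmetry between the two directions can be illustrated with a perturbation whose transform is known in closed form. By Eq. 15.107 a ripple $A\sin\omega t$ added to $F(t)$ changes $f(s)$ by $A\omega/(s^2 + \omega^2)$, which falls off like $1/\omega$ for a rapid ripple. The sketch below simply tabulates this change for a unit-amplitude ripple at $s = 1$; the function name and the sample frequencies are illustrative assumptions.

```python
def delta_f(amplitude, omega, s):
    """Change in f(s) caused by adding amplitude*sin(omega*t) to F(t), from Eq. 15.107."""
    return amplitude * omega / (s * s + omega * omega)


# A ripple of unit amplitude in F(t) leaves a trace in f(1) that shrinks like 1/omega.
changes = {omega: delta_f(1.0, omega, s=1.0) for omega in (1.0, 10.0, 100.0, 1000.0)}
for omega, df in changes.items():
    print(f"omega = {omega:7.1f}   change in f(1) = {df:.3e}")
```

Read in reverse, a change of about one part in a thousand in $f(1)$ is compatible with a change of order unity in $F(t)$, which is the instability described above.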

There is no general, completely satisfactory numerical method for inverting
Laplace transforms. However, if we are willing to restrict attention to relatively
smooth functions, various possibilities open up. Bellman, Kalaba, and Lockett$^3$
convert the Laplace transform to a Mellin transform ($x = e^{-t}$) and use numerical
quadrature based on shifted Legendre polynomials, $P_n^*(x) = P_n(1 - 2x)$. The
key step is analytic inversion of the resulting matrix. Krylov and Skoblya$^4$ focus
on evaluation of the Bromwich integral (Section 15.12). As one technique, they
replace the integrand with an interpolating polynomial of negative powers and
integrate analytically.

$^3$ R. Bellman, R. E. Kalaba, and J. A. Lockett, Numerical Inversion of the
Laplace Transforms. New York: American Elsevier (1966).

$^4$ V. I. Krylov and N. S. Skoblya, Handbook of Numerical Inversion of Laplace
Transforms. Translated by D. Louvish. Jerusalem: Israel Program for Scientific
Translations (1969).


EXERCISES 


15.8.1 Prove that

$$\lim_{s\to\infty} s\,f(s) = \lim_{t\to +0} F(t).$$

Hint. Assume that $F(t)$ can be expressed as $F(t) = \sum_{n=0}^{\infty} a_n t^n$.

15.8.2 Show that

$$\frac{1}{\pi}\lim_{s\to 0}\mathcal{L}\{\cos xt\} = \delta(x).$$

15.8.3 Verify that

$$\mathcal{L}\left\{\frac{\cos at - \cos bt}{b^2 - a^2}\right\} = \frac{s}{(s^2 + a^2)(s^2 + b^2)}, \qquad a^2 \ne b^2.$$


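15.8.4 Using partial fraction expansions, show that

(a) $$\mathcal{L}^{-1}\left\{\frac{1}{(s + a)(s + b)}\right\} = \frac{e^{-at} - e^{-bt}}{b - a}, \qquad a \ne b.$$

(b) $$\mathcal{L}^{-1}\left\{\frac{s}{(s + a)(s + b)}\right\} = \frac{a e^{-at} - b e^{-bt}}{a - b}, \qquad a \ne b.$$

15.8.5 Using partial fraction expansions, show that

(a) $$\mathcal{L}^{-1}\left\{\frac{1}{(s^2 + a^2)(s^2 + b^2)}\right\} = -\frac{1}{a^2 - b^2}\left\{\frac{\sin at}{a} - \frac{\sin bt}{b}\right\}, \qquad a^2 \ne b^2,$$

(b) $$\mathcal{L}^{-1}\left\{\frac{s^2}{(s^2 + a^2)(s^2 + b^2)}\right\} = \frac{1}{a^2 - b^2}\{a\sin at - b\sin bt\}, \qquad a^2 \ne b^2.$$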

15.8.6 The electrostatic potential of a charged conducting disk is known to have the
general form (circular cylindrical coordinates)

$$\Phi(\rho, z) = \int_0^{\infty} e^{-k|z|}J_0(k\rho)\,f(k)\,dk,$$

with $f(k)$ unknown. At large distances ($z \to \infty$) the potential must approach the
Coulomb potential $Q/4\pi\varepsilon_0 z$. Show that

$$\lim_{k\to 0} f(k) = \frac{Q}{4\pi\varepsilon_0}.$$

Hint. You may set $\rho = 0$ and assume a Maclaurin expansion of $f(k)$ or, using
$e^{-kz}$, construct a delta sequence.

15.8.7 Show that

(a) $$\int_0^{\infty} \frac{\cos s}{s^{\nu}}\,ds = \frac{\pi}{2(\nu - 1)!\,\cos(\nu\pi/2)}, \qquad 0 < \nu < 1,$$

(b) $$\int_0^{\infty} \frac{\sin s}{s^{\nu}}\,ds = \frac{\pi}{2(\nu - 1)!\,\sin(\nu\pi/2)}, \qquad 0 < \nu < 2.$$

Why is $\nu$ restricted to (0, 1) for (a), to (0, 2) for (b)? These integrals may be
interpreted as Fourier transforms of $s^{-\nu}$ and as Mellin transforms of $\sin s$ and
$\cos s$.

Hint. Replace $s^{-\nu}$ by a Laplace transform integral: $\mathcal{L}\{t^{\nu-1}\}/(\nu - 1)!$. Then
integrate with respect to $s$. The resulting integral can be treated as a beta function
(Section 10.4).

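15.8.8 A function $F(t)$ can be expanded in a power series (Maclaurin); that is,

$$F(t) = \sum_{n=0}^{\infty} a_n t^n.$$

Then

$$\mathcal{L}\{F(t)\} = \int_0^{\infty} e^{-st}\sum_{n=0}^{\infty} a_n t^n\,dt = \sum_{n=0}^{\infty} a_n\int_0^{\infty} e^{-st}\,t^n\,dt.$$

Show that $f(s)$, the Laplace transform of $F(t)$, contains no powers of $s$ greater
than $s^{-1}$. Check your result by calculating $\mathcal{L}\{\delta(t)\}$ and comment intelligently
on this fiasco.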


15.9 LAPLACE TRANSFORM OF DERIVATIVES


Perhaps the main application of Laplace transforms is in converting differential
equations into simpler forms that may be solved more easily. It will be seen,
for instance, that coupled differential equations with constant coefficients
transform to simultaneous linear algebraic equations.

Let us transform the first derivative of $F(t)$.

$$\mathcal{L}\{F'(t)\} = \int_0^{\infty} e^{-st}\frac{dF(t)}{dt}\,dt.$$

Integrating by parts, we obtain

$$\mathcal{L}\{F'(t)\} = e^{-st}F(t)\Big|_0^{\infty} + s\int_0^{\infty} e^{-st}F(t)\,dt = s\,\mathcal{L}\{F(t)\} - F(0). \tag{15.123}$$

Strictly speaking, $F(0) = F(+0)$,$^1$ and $dF/dt$ is required to be at least piecewise
continuous for $0 \le t < \infty$. Naturally, both $F(t)$ and its derivative must be such
that the integrals do not diverge. Incidentally, Eq. 15.123 provides another proof
of Exercise 15.8.1.

An extension gives

$$\mathcal{L}\{F^{(2)}(t)\} = s^2\mathcal{L}\{F(t)\} - sF(+0) - F'(+0), \tag{15.124}$$

$$\mathcal{L}\{F^{(n)}(t)\} = s^n\mathcal{L}\{F(t)\} - s^{n-1}F(+0) - s^{n-2}F'(+0) - \cdots - F^{(n-1)}(+0). \tag{15.125}$$

$^1$ Zero is approached from the positive side.

The Laplace transform like the Fourier transform replaces differentiation with 
multiplication. In the following examples differential equations become alge- 
braic equations. The degree of transcendence is reduced, and the solution is 
simplified. Here is the power and the utility of the Laplace transform. But see 
Example 15.10.3 for what may happen if the coefficients are not constant. 
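Equation 15.123 can be tested numerically for any well-behaved function. The sketch below is illustrative only: it assumes the sample function $F(t) = e^{-t}\cos 2t$, with its derivative written out by hand, and compares $\mathcal{L}\{F'(t)\}$ with $s\,\mathcal{L}\{F(t)\} - F(0)$ at $s = 1$. The quadrature helper `laplace` and its cutoff are assumptions of this sketch.

```python
import math


def laplace(F, s, t_max=40.0, n=8000):
    """Approximate the Laplace transform (Eq. 15.99) by Simpson's rule on [0, t_max]."""
    h = t_max / n
    total = F(0.0) + math.exp(-s * t_max) * F(t_max)
    for i in range(1, n):
        t = i * h
        total += (4 if i % 2 else 2) * math.exp(-s * t) * F(t)
    return total * h / 3


def F(t):
    """Illustrative test function (an assumption, not from the text)."""
    return math.exp(-t) * math.cos(2 * t)


def dF(t):
    """Derivative of the test function, computed analytically."""
    return -math.exp(-t) * (math.cos(2 * t) + 2 * math.sin(2 * t))


s = 1.0
lhs = laplace(dF, s)               # transform of the derivative
rhs = s * laplace(F, s) - F(0.0)   # right side of Eq. 15.123
print(lhs, rhs)
```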

Note carefully how the initial conditions, $F(+0)$, $F'(+0)$, and so on, are
incorporated into the transform. Equation 15.124 may be used to derive
$\mathcal{L}\{\sin kt\}$. We use the identity

$$-k^2\sin kt = \frac{d^2}{dt^2}\sin kt. \tag{15.126}$$

Then, applying the Laplace transform operation, we have

$$-k^2\mathcal{L}\{\sin kt\} = \mathcal{L}\left\{\frac{d^2}{dt^2}\sin kt\right\} = s^2\mathcal{L}\{\sin kt\} - s\sin(0) - \frac{d}{dt}\sin kt\,\Big|_{t=0}. \tag{15.127}$$

Since $\sin(0) = 0$ and $d/dt\,\sin kt|_{t=0} = k$,

$$\mathcal{L}\{\sin kt\} = \frac{k}{s^2 + k^2}, \tag{15.128}$$

verifying Eq. 15.107.

EXAMPLE 15.9.1 Simple Harmonic Oscillator 

As a simple but reasonably physical example, consider a mass $m$ oscillating
under the influence of an ideal spring, spring constant $k$. As usual, friction is
neglected. Then Newton's second law becomes

$$m\frac{d^2X(t)}{dt^2} + kX(t) = 0; \tag{15.129}$$

also

$$X(0) = X_0, \qquad X'(0) = 0.$$

Applying the Laplace transform, we obtain

$$m\,\mathcal{L}\left\{\frac{d^2X}{dt^2}\right\} + k\,\mathcal{L}\{X(t)\} = 0, \tag{15.130}$$

and by use of Eq. 15.124 this becomes

$$ms^2x(s) - msX_0 + k\,x(s) = 0, \tag{15.131}$$

$$x(s) = X_0\frac{s}{s^2 + \omega_0^2}, \qquad \text{with } \omega_0^2 = \frac{k}{m}. \tag{15.132}$$

From Eq. 15.107 this is seen to be the transform of $\cos\omega_0 t$, which gives

$$X(t) = X_0\cos\omega_0 t, \tag{15.133}$$

as expected.

EXAMPLE 15.9.2 Earth’s Nutation 


A somewhat more involved example is provided by the nutation of the earth's
poles (force-free precession). Treating the earth as a rigid (oblate) spheroid, the
Euler equations of motion reduce to

$$\frac{dX}{dt} = -aY, \qquad \frac{dY}{dt} = +aX, \tag{15.134}$$

where $a \equiv [(I_z - I_x)/I_x]\,\omega_z$,

$X = \omega_x$,

$Y = \omega_y$, with angular velocity vector $\boldsymbol{\omega} = (\omega_x, \omega_y, \omega_z)$ (Fig. 15.8),

$I_z$ = moment of inertia about the $z$-axis and $I_y = I_x$ = moment of inertia
about the $x$- (or $y$-) axis.

FIG. 15.8


The $z$-axis coincides with the axis of symmetry of the earth. It differs from the
axis for the earth's daily rotation, $\boldsymbol{\omega}$, by some 15 meters, measured at the poles.
Transformation of these coupled differential equations yields

$$s\,x(s) - X(0) = -a\,y(s), \qquad s\,y(s) - Y(0) = a\,x(s). \tag{15.135}$$

Combining to eliminate $y(s)$, we have

$$s^2x(s) - sX(0) + aY(0) = -a^2x(s)$$

or

$$x(s) = X(0)\frac{s}{s^2 + a^2} - Y(0)\frac{a}{s^2 + a^2}. \tag{15.136}$$

Hence

$$X(t) = X(0)\cos at - Y(0)\sin at. \tag{15.137}$$

Similarly,

$$Y(t) = X(0)\sin at + Y(0)\cos at. \tag{15.138}$$

This is seen to be a rotation of the vector $(X, Y)$ counterclockwise (for $a > 0$)
about the $z$-axis with angle $\theta = at$ and angular velocity $a$.

A direct interpretation may be found by choosing the time axis so that 
7(0) = 0. Then 


X(t) = X(0) cos at, 
Y(t) = X(0)sinat, 


(15.139) 


which are the parametric equations for rotation of (X, Y) in a circular orbit of 
radius X(0), with angular velocity a in the counterclockwise sense. 
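The rotation described by Eqs. 15.137 and 15.138 can be compared with a direct numerical integration of the coupled system, Eq. 15.134. This is a minimal sketch: the Runge-Kutta integrator, the value of a, and the initial values are illustrative assumptions, not taken from the text.

```python
import math

def rk4(f, y, t_end, steps):
    """Integrate y' = f(y) from t = 0 to t_end with the classical Runge-Kutta rule."""
    h = t_end / steps
    for _ in range(steps):
        k1 = f(y)
        k2 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k1)])
        k3 = f([yi + 0.5 * h * ki for yi, ki in zip(y, k2)])
        k4 = f([yi + h * ki for yi, ki in zip(y, k3)])
        y = [yi + h * (p + 2 * q + 2 * r + w) / 6
             for yi, p, q, r, w in zip(y, k1, k2, k3, k4)]
    return y

a = 1.3                       # illustrative precession rate
X0, Y0, t = 1.0, 0.5, 2.0     # illustrative initial values and time

# Eq. 15.134: dX/dt = -a Y, dY/dt = +a X
X_num, Y_num = rk4(lambda y: [-a * y[1], a * y[0]], [X0, Y0], t, 2000)

# Eqs. 15.137 and 15.138
X_exact = X0 * math.cos(a * t) - Y0 * math.sin(a * t)
Y_exact = X0 * math.sin(a * t) + Y0 * math.cos(a * t)
print(X_num - X_exact, Y_num - Y_exact)
```

Because the motion is a pure rotation, the radius $\sqrt{X^2 + Y^2}$ should also stay constant, which gives a second check on the integration.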

In the case of the earth’s angular velocity vector, $X(0)$ is about 15 meters,
whereas a, as defined here, corresponds to a period (2n/a) of some 300 days. 
Actually because of departures from the idealized rigid body assumed in setting 
up Euler’s equations, the period is about 427 days. 2 

If in Eq. 15.134 we set 


$$X(t) = L_x, \qquad Y(t) = L_y,$$

where $L_x$ and $L_y$ are the x- and y-components of the angular momentum $\mathbf{L}$,

$$a = -g_L B_z,$$

$g_L$ = gyromagnetic ratio,

$B_z$ = magnetic field (along the z-axis),

Eq. 15.134 describes the Larmor precession of charged bodies in a uniform 
magnetic field, B z . 


Dirac Delta Function 

For use with differential equations one further transform is helpful, that of the
Dirac delta function: 3

$$\mathcal{L}\{\delta(t - t_0)\} = \int_0^\infty e^{-st}\,\delta(t - t_0)\,dt = e^{-st_0}, \qquad \text{for } t_0 > 0, \tag{15.140}$$

and for $t_0 = 0$

$$\mathcal{L}\{\delta(t)\} = 1, \tag{15.141}$$

where it is assumed that we are using a representation of the delta function
such that

$$\int_0^\infty \delta(t)\,dt = 1, \qquad \delta(t) = 0 \quad \text{for } t > 0. \tag{15.142}$$

2 D. Menzel, ed., Fundamental Formulas of Physics, p. 695. Englewood Cliffs,
NJ: Prentice-Hall (1955).

3 Strictly speaking, the Dirac delta function is undefined. However, the integral
over it is well defined. This approach is developed in Section 8.7 using delta
sequences.


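As an alternate method, $\delta(t)$ may be considered the limit as $\varepsilon \to 0$ of $F(t)$, where

$$F(t) = \begin{cases} 0, & t < 0, \\ \varepsilon^{-1}, & 0 < t < \varepsilon, \\ 0, & t > \varepsilon. \end{cases} \tag{15.143}$$

By direct calculation

$$\mathcal{L}\{F(t)\} = \frac{1 - e^{-\varepsilon s}}{\varepsilon s}. \tag{15.144}$$

Taking the limit of the integral (instead of the integral of the limit), we have

$$\lim_{\varepsilon \to 0} \mathcal{L}\{F(t)\} = 1$$

or Eq. 15.141

$$\mathcal{L}\{\delta(t)\} = 1.$$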
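The limit in Eq. 15.144 can be watched numerically. The sketch below tabulates the pulse transform for a shrinking width; the value of s and the list of widths are illustrative assumptions.

```python
import math

def pulse_transform(eps, s):
    """Eq. 15.144: transform of a pulse of height 1/eps and width eps.

    expm1 is used so that 1 - exp(-eps*s) stays accurate for tiny eps*s.
    """
    return -math.expm1(-eps * s) / (eps * s)

s = 2.0                                        # illustrative value
widths = [10.0 ** (-p) for p in range(0, 7)]   # eps = 1, 0.1, ..., 1e-6
values = [pulse_transform(eps, s) for eps in widths]
for eps, val in zip(widths, values):
    print(f"eps = {eps:8.1e}   transform = {val:.8f}")
```

The tabulated values rise monotonically toward 1 as the pulse narrows, which is the content of Eq. 15.141.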

This delta function is frequently called the impulse function because it is so
useful in describing impulsive forces, that is, forces lasting only a short time.

EXAMPLE 15.9.3 Impulsive Force 

Newton’s second law for impulsive force acting on a particle of mass m 
becomes 

$$m\frac{d^2X}{dt^2} = P\,\delta(t), \tag{15.145}$$


where P is a constant. 

Transforming, we obtain 

ms 2 x(s) - msX(0) - mX'(0) = P. (15.146) 

For a particle starting from rest $X'(0) = 0$. 4 We shall also take $X(0) = 0$. Then

$$x(s) = \frac{P}{ms^2}, \tag{15.147}$$

and

$$X(t) = \frac{P}{m}\,t, \tag{15.148}$$

$$\frac{dX(t)}{dt} = \frac{P}{m}, \quad \text{a constant.} \tag{15.149}$$

4 This really should be $X'(+0)$. To include the effect of the impulse, consider
that the impulse will occur at $t = \varepsilon$ and let $\varepsilon \to 0$.

The effect of the impulse PS(t) is to transfer (instantaneously) P units of 
linear momentum to the particle. 

A similar analysis applies to the ballistic galvanometer. The torque on the
galvanometer is given initially by $ki$, in which $i$ is a pulse of current and $k$ is a
proportionality constant. Since $i$ is of short duration, we set

$$ki = kq\,\delta(t), \tag{15.150}$$

where $q$ is the total charge carried by the current $i$. Then, with $I$ the moment
of inertia,

$$I\frac{d^2\theta}{dt^2} = kq\,\delta(t), \tag{15.151}$$

and transforming as before, we find that the effect of the current pulse is a 
transfer of kq units of angular momentum to the galvanometer. 

EXERCISES 

15.9.1 Use the expression for the transform of a second derivative to obtain the transform
of cos kt. 

15.9.2 A mass m is attached to one end of an unstretched spring, spring constant k.
At time t = 0 the free end of the spring experiences a constant acceleration a,
away from the mass. Using Laplace transforms,

(a) Find the position x of m as a function of time.

(b) Determine the limiting form of x(t) for small t.

ANS. (a) $x = \tfrac{1}{2}at^2 - \dfrac{a}{\omega^2}(1 - \cos\omega t)$, $\omega^2 = k/m$.

(b) $x \approx \dfrac{a\omega^2}{4!}\,t^4$, $\omega t \ll 1$.

15.9.3 Radioactive nuclei decay according to the law

$$\frac{dN}{dt} = -\lambda N,$$

N being the concentration of a given nuclide and $\lambda$ the particular decay constant.
This equation may be interpreted as stating that the rate of decay is proportional
to the number of these radioactive nuclei present. They all decay independently.
In a radioactive series of n different nuclides, starting with $N_1$,

$$\frac{dN_1}{dt} = -\lambda_1 N_1,$$

$$\frac{dN_2}{dt} = \lambda_1 N_1 - \lambda_2 N_2, \quad \text{and so on,}$$

$$\frac{dN_n}{dt} = \lambda_{n-1} N_{n-1}, \quad \text{stable.}$$

Find $N_1(t)$, $N_2(t)$, and $N_3(t)$, n = 3, with $N_1(0) = N_0$, $N_2(0) = N_3(0) = 0$.

ANS. $N_1(t) = N_0 e^{-\lambda_1 t}$,

$N_2(t) = N_0 \dfrac{\lambda_1}{\lambda_2 - \lambda_1}\left(e^{-\lambda_1 t} - e^{-\lambda_2 t}\right)$,

$N_3(t) = N_0\left(1 - \dfrac{\lambda_2}{\lambda_2 - \lambda_1}e^{-\lambda_1 t} + \dfrac{\lambda_1}{\lambda_2 - \lambda_1}e^{-\lambda_2 t}\right)$.

Find an approximate expression for $N_2$ and $N_3$, valid for small t when $\lambda_1 \approx \lambda_2$.

ANS. $N_2 \approx N_0\lambda_1 t$,

$N_3 \approx \dfrac{N_0}{2}\lambda_1\lambda_2 t^2$.

Find approximate expressions for $N_2$ and $N_3$, valid for large t, when

(a) $\lambda_1 \gg \lambda_2$,

(b) $\lambda_1 \ll \lambda_2$.

ANS. (a) $N_2 \approx N_0 e^{-\lambda_2 t}$,

$N_3 \approx N_0\left(1 - e^{-\lambda_2 t}\right)$, $\lambda_1 t \gg 1$.

(b) $N_2 \approx N_0 \dfrac{\lambda_1}{\lambda_2} e^{-\lambda_1 t}$,

$N_3 \approx N_0\left(1 - e^{-\lambda_1 t}\right)$, $\lambda_2 t \gg 1$.

15.9.4 The formation of an isotope in a nuclear reactor is given by

$$\frac{dN_2}{dt} = nv\sigma_1 N_{10} - \lambda_2 N_2(t) - nv\sigma_2 N_2(t).$$

Here the product $nv$ is the neutron flux, neutrons per cubic centimeter, times
centimeters per second mean velocity; $\sigma_1$ and $\sigma_2$ (cm$^2$) are measures of the
probability of neutron absorption by the original isotope, concentration $N_{10}$, which
is assumed constant, and the newly formed isotope, concentration $N_2$, respectively.
The radioactive decay constant for the isotope is $\lambda_2$.

(a) Find the concentration $N_2$ of the new isotope as a function of time.

(b) If the original element is Eu$^{153}$, $\sigma_1$ = 400 barns = 400 × 10$^{-24}$ cm$^2$, $\sigma_2$ =
1000 barns = 1000 × 10$^{-24}$ cm$^2$, and $\lambda_2$ = 1.4 × 10$^{-9}$ sec$^{-1}$. If $N_{10}$ = 10$^{20}$
and $(nv)$ = 10$^9$ cm$^{-2}$ sec$^{-1}$, find $N_2$, the concentration of Eu$^{154}$ after one
year of continuous irradiation. Is the assumption that $N_1$ is constant
justified?

15.9.5 In a nuclear reactor Xe$^{135}$ is formed as both a direct fission product and a decay
product of I$^{135}$, half-life, 6.7 hours. The half-life of Xe$^{135}$ is 9.2 hours. As Xe$^{135}$
strongly absorbs thermal neutrons thereby “poisoning” the nuclear reactor, its
concentration is a matter of great interest. The relevant equations are

$$\frac{dN_I}{dt} = \gamma_I \varphi \sigma_f N_U - \lambda_I N_I,$$

$$\frac{dN_X}{dt} = \lambda_I N_I + \gamma_X \varphi \sigma_f N_U - \lambda_X N_X - \varphi \sigma_X N_X.$$

Here $N_I$ ($N_X$, $N_U$) = concentration of I$^{135}$ (Xe$^{135}$, U$^{235}$). Assume $N_U$ = constant.

$\gamma_I$ = yield of I$^{135}$ per fission = 0.060,

$\gamma_X$ = yield of Xe$^{135}$ direct from fission = 0.003,

$\lambda_I$ ($\lambda_X$) = I$^{135}$ (Xe$^{135}$) decay constant = $\dfrac{\ln 2}{t_{1/2}} = \dfrac{0.693}{t_{1/2}}$,

$\sigma_f$ = thermal neutron fission cross section for U$^{235}$,

$\sigma_X$ = thermal neutron absorption cross section for Xe$^{135}$ = 3.5 × 10$^6$ barns
= 3.5 × 10$^{-18}$ cm$^2$.

($\sigma_I$, the absorption cross section of I$^{135}$, is negligible.)

$\varphi$ = neutron flux = neutrons/cm$^3$ × mean velocity (cm/sec)

(a) Find $N_X(t)$ in terms of neutron flux $\varphi$ and the product $\sigma_f N_U$.

(b) Find $N_X(t \to \infty)$.

(c) After $N_X$ has reached equilibrium, the reactor is shut down, $\varphi = 0$. Find
$N_X(t)$ following shut down. Notice the increase in $N_X$, which may for a few
hours interfere with starting the reactor up again.


15.10 OTHER PROPERTIES 


Substitution 

If we replace the parameter $s$ by $s - a$ in the definition of the Laplace transform
(Eq. 15.99), we have

$$f(s - a) = \int_0^\infty e^{-(s-a)t}F(t)\,dt = \int_0^\infty e^{-st}e^{at}F(t)\,dt = \mathcal{L}\{e^{at}F(t)\}. \tag{15.152}$$

Hence the replacement of $s$ with $s - a$ corresponds to multiplying $F(t)$ by $e^{at}$
and conversely. This result can be used to good advantage in extending our
table of transforms. From Eq. 15.107 we find immediately that

$$\mathcal{L}\{e^{at}\sin kt\} = \frac{k}{(s-a)^2 + k^2}, \tag{15.153}$$

also

$$\mathcal{L}\{e^{at}\cos kt\} = \frac{s - a}{(s-a)^2 + k^2}, \qquad s > a.$$


EXAMPLE 15.10.1 Damped Oscillator 

These expressions are useful when we consider an oscillating mass with 
damping proportional to the velocity. Equation 15.129, with such damping 
added, becomes 


$$mX''(t) + bX'(t) + kX(t) = 0, \tag{15.154}$$

in which $b$ is a proportionality constant. Let us assume that the particle starts
from rest at $X(0) = X_0$, $X'(0) = 0$. The transformed equation is

$$m[s^2x(s) - sX_0] + b[s\,x(s) - X_0] + k\,x(s) = 0 \tag{15.155}$$

and

$$x(s) = X_0\frac{ms + b}{ms^2 + bs + k}. \tag{15.156}$$

This may be handled by completing the square of the denominator,

$$s^2 + \frac{b}{m}s + \frac{k}{m} = \left(s + \frac{b}{2m}\right)^2 + \left(\frac{k}{m} - \frac{b^2}{4m^2}\right). \tag{15.157}$$

If the damping is small, $b^2 < 4km$, the last term is positive and will be denoted
by $\omega_1^2$.

$$x(s) = X_0\frac{s + b/m}{(s + b/2m)^2 + \omega_1^2} = X_0\frac{s + b/2m}{(s + b/2m)^2 + \omega_1^2} + X_0\frac{(b/2m\omega_1)\,\omega_1}{(s + b/2m)^2 + \omega_1^2}. \tag{15.158}$$

By Eq. 15.153

$$X(t) = X_0 e^{-(b/2m)t}\left(\cos\omega_1 t + \frac{b}{2m\omega_1}\sin\omega_1 t\right) = X_0\frac{\omega_0}{\omega_1}e^{-(b/2m)t}\cos(\omega_1 t - \varphi), \tag{15.159}$$

where

$$\tan\varphi = \frac{b}{2m\omega_1}, \qquad \omega_0^2 = \frac{k}{m}.$$

Of course, as $b \to 0$, this solution goes over to the undamped solution
(Section 15.9).
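The two forms of Eq. 15.159 can be checked against each other and against a direct integration of Eq. 15.154. The sketch below assumes illustrative values of m, b, k, and $X_0$ (chosen so that $b^2 < 4km$) and uses a hand-written Runge-Kutta step; none of these choices come from the text.

```python
import math

m, b, k, X0 = 1.0, 0.4, 4.0, 1.0     # illustrative, with b*b < 4*k*m
beta = b / (2 * m)
omega0 = math.sqrt(k / m)
omega1 = math.sqrt(k / m - beta**2)

def closed_form(t):
    """First form of Eq. 15.159."""
    return X0 * math.exp(-beta * t) * (
        math.cos(omega1 * t) + beta / omega1 * math.sin(omega1 * t))

def amplitude_phase_form(t):
    """Second form of Eq. 15.159, with tan(phi) = b / (2 m omega1)."""
    phi = math.atan(beta / omega1)
    return X0 * omega0 / omega1 * math.exp(-beta * t) * math.cos(omega1 * t - phi)

def integrate(t_end, steps=20000):
    """RK4 integration of m X'' + b X' + k X = 0 with X(0) = X0, X'(0) = 0."""
    h = t_end / steps
    x, v = X0, 0.0
    acc = lambda x, v: -(b * v + k * x) / m
    for _ in range(steps):
        k1x, k1v = v, acc(x, v)
        k2x, k2v = v + 0.5 * h * k1v, acc(x + 0.5 * h * k1x, v + 0.5 * h * k1v)
        k3x, k3v = v + 0.5 * h * k2v, acc(x + 0.5 * h * k2x, v + 0.5 * h * k2v)
        k4x, k4v = v + h * k3v, acc(x + h * k3x, v + h * k3v)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return x

t = 3.0
print(integrate(t), closed_form(t), amplitude_phase_form(t))
```

All three numbers should coincide, which confirms both the inversion via Eq. 15.153 and the amplitude-phase rewriting.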


RLC Analog 

It is worth noting the similarity between this damped simple harmonic 
oscillation of a mass on a spring and an RLC circuit (resistance, inductance, 
and capacitance) (Fig. 15.9). At any instant the sum of the potential differences 
around the loop must be zero (Kirchhoff’s law, conservation of energy). This 
gives 

$$L\frac{dI}{dt} + RI + \frac{1}{C}\int I\,dt = 0. \tag{15.160}$$

Differentiating with respect to time (to eliminate the integral), we have

$$L\frac{d^2I}{dt^2} + R\frac{dI}{dt} + \frac{1}{C}I = 0. \tag{15.161}$$

If we replace $I(t)$ with $X(t)$, $L$ with $m$, $R$ with $b$, $C^{-1}$ with $k$, Eq. 15.161 is identical
with the mechanical problem. It is but one example of the unification of diverse
branches of physics by mathematics. A more complete discussion will be found
in Olson’s book. 1

FIG. 15.9 RLC circuit

Translation 

This time let $f(s)$ be multiplied by $e^{-bs}$, $b > 0$.

$$e^{-bs}f(s) = e^{-bs}\int_0^\infty e^{-st}F(t)\,dt = \int_0^\infty e^{-s(t+b)}F(t)\,dt. \tag{15.162}$$

Now let $t + b = \tau$. Equation 15.162 becomes

$$e^{-bs}f(s) = \int_b^\infty e^{-s\tau}F(\tau - b)\,d\tau = \int_0^\infty e^{-s\tau}F(\tau - b)\,u(\tau - b)\,d\tau, \tag{15.163}$$

where $u(\tau - b)$ is the unit step function. This relation is often called the
“Heaviside shifting theorem” (Fig. 15.10).

FIG. 15.10 Translation


1 H. F. Olson, Dynamical Analogies. New York: Van Nostrand (1943).

Since $F(t)$ is assumed to be equal to zero for $t < 0$, $F(\tau - b) = 0$ for $0 \le \tau < b$.
Therefore we can extend the lower limit to zero without changing the value of
the integral. Then, noting that $\tau$ is only a variable of integration, we obtain

$$e^{-bs}f(s) = \mathcal{L}\{F(t - b)\}. \tag{15.164}$$
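A numerical illustration of the shifting theorem, Eq. 15.164, may help. The sketch below assumes the sample function $F(t) = te^{-t}$ (zero for negative argument), whose transform $1/(s+1)^2$ is a standard table entry, and illustrative values of s and b; the Simpson rule and the truncation point are likewise assumptions.

```python
import math

def simpson(g, lo, hi, n=20000):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def F(t):
    """Sample function, taken to vanish for t < 0 as the text assumes."""
    return t * math.exp(-t) if t >= 0 else 0.0

s, b = 1.5, 0.7                      # illustrative values
t_max = b + 40.0 / s                 # tail beyond this is negligible

# F(t - b) vanishes on [0, b), so the integration can start at t = b.
shifted = simpson(lambda t: math.exp(-s * t) * F(t - b), b, t_max)

f_s = 1.0 / (s + 1.0) ** 2           # transform of t * exp(-t)
predicted = math.exp(-b * s) * f_s   # left-hand side of Eq. 15.164
print(shifted, predicted)
```

Starting the quadrature at $t = b$ avoids the kink of the shifted function at that point, which would otherwise degrade the accuracy of Simpson's rule.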

EXAMPLE 15.10.2 Electromagnetic Waves 


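The electromagnetic wave equation with $E = E_y$ or $E_z$, a transverse wave
propagating along the x-axis, is

$$\frac{\partial^2 E(x,t)}{\partial x^2} - \frac{1}{v^2}\frac{\partial^2 E(x,t)}{\partial t^2} = 0. \tag{15.165}$$

Transforming this equation with respect to $t$, we get

$$\frac{\partial^2}{\partial x^2}\mathcal{L}\{E(x,t)\} - \frac{s^2}{v^2}\mathcal{L}\{E(x,t)\} + \frac{s}{v^2}E(x,0) + \frac{1}{v^2}\left.\frac{\partial E(x,t)}{\partial t}\right|_{t=0} = 0. \tag{15.166}$$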


If we have the initial condition $E(x, 0) = 0$ and

$$\left.\frac{\partial E(x,t)}{\partial t}\right|_{t=0} = 0,$$

then

$$\frac{\partial^2}{\partial x^2}\mathcal{L}\{E(x,t)\} = \frac{s^2}{v^2}\mathcal{L}\{E(x,t)\}. \tag{15.167}$$

The solution (of this ordinary differential equation) is

$$\mathcal{L}\{E(x,t)\} = c_1 e^{-(s/v)x} + c_2 e^{+(s/v)x}. \tag{15.168}$$

The “constants” $c_1$ and $c_2$ are obtained by additional boundary conditions.
They are constant with respect to $x$ but may depend on $s$. If our wave remains
finite as $x \to \infty$, $\mathcal{L}\{E(x,t)\}$ will also remain finite. Hence $c_2 = 0$.

If $E(0, t)$ is denoted by $F(t)$, then $c_1 = f(s)$ and

$$\mathcal{L}\{E(x,t)\} = e^{-(s/v)x}f(s). \tag{15.169}$$

From the translation property (Eq. 15.164) we find immediately that

$$E(x,t) = \begin{cases} F\left(t - \dfrac{x}{v}\right), & t \ge \dfrac{x}{v}, \\ 0, & t < \dfrac{x}{v}. \end{cases} \tag{15.170}$$

Differentiation and substitution into Eq. 15.165 verifies Eq. 15.170. Our solution
represents a wave (or pulse) moving in the positive x-direction with velocity $v$.
Note that for $x > vt$ the region remains undisturbed; the pulse has not had
time to get there. If we had wanted a signal propagated along the negative
x-axis, $c_1$ would have been set equal to 0 and we would have obtained

$$E(x,t) = \begin{cases} F\left(t + \dfrac{x}{v}\right), & t \ge -\dfrac{x}{v}, \\ 0, & t < -\dfrac{x}{v}, \end{cases} \tag{15.171}$$

a wave along the negative x-axis.

Derivative of a Transform 

When $F(t)$, which is at least piecewise continuous, and $s$ are chosen so that
$e^{-st}F(t)$ converges exponentially for large $s$, the integral

$$\int_0^\infty e^{-st}F(t)\,dt$$

is uniformly convergent and may be differentiated (under the integral sign)
with respect to $s$. Then

$$f'(s) = \int_0^\infty (-t)e^{-st}F(t)\,dt = \mathcal{L}\{-tF(t)\}. \tag{15.172}$$

Continuing this process, we obtain

$$f^{(n)}(s) = \mathcal{L}\{(-t)^nF(t)\}. \tag{15.173}$$

All the integrals so obtained will be uniformly convergent because of the
decreasing exponential behavior of $e^{-st}F(t)$.

This same technique may be applied to generate more transforms. For
example,

$$\mathcal{L}\{e^{kt}\} = \int_0^\infty e^{-st}e^{kt}\,dt = \frac{1}{s - k}, \qquad s > k. \tag{15.174}$$

Differentiating with respect to $s$ (or with respect to $k$), we obtain

$$\mathcal{L}\{te^{kt}\} = \frac{1}{(s - k)^2}, \qquad s > k. \tag{15.175}$$


EXAMPLE 15.10.3 Bessel’s Equation 

An interesting application of a differentiated Laplace transform appears in 
the solution of Bessel’s equation with n = 0. From Chapter 11 we have 

$$x^2y''(x) + xy'(x) + x^2y(x) = 0. \tag{15.176}$$

Dividing by $x$ and substituting $t = x$ and $F(t) = y(x)$ to agree with the present
notation, we see that the Bessel equation becomes

$$tF''(t) + F'(t) + tF(t) = 0. \tag{15.177}$$

We need a regular solution, in particular, $F(0) = 1$. From Eq. 15.177 with $t = 0$,
$F'(+0) = 0$. Also, we assume that our unknown $F(t)$ has a transform. Then,
transforming and using Eqs. 15.124 and 15.172, we have

$$-\frac{d}{ds}\left[s^2f(s) - s\right] + sf(s) - 1 - \frac{d}{ds}f(s) = 0. \tag{15.178}$$

Rearranging Eq. 15.178, we obtain

$$(s^2 + 1)f'(s) + sf(s) = 0 \tag{15.179}$$

or

$$\frac{df}{f} = -\frac{s\,ds}{s^2 + 1}, \tag{15.180}$$

a first-order differential equation. By integration,

$$\ln f(s) = -\tfrac{1}{2}\ln(s^2 + 1) + \ln C, \tag{15.181}$$

which may be rewritten as

$$f(s) = \frac{C}{\sqrt{s^2 + 1}}. \tag{15.182}$$

To make use of Eq. 15.108, we expand $f(s)$ in a series of negative powers of $s$,
convergent for $s > 1$:

$$f(s) = \frac{C}{s}\left(1 + \frac{1}{s^2}\right)^{-1/2} = \frac{C}{s}\left[1 - \frac{1}{2s^2} + \frac{1\cdot 3}{2^2\cdot 2!\,s^4} - \cdots + \frac{(-1)^n(2n)!}{(2^nn!)^2s^{2n}} + \cdots\right]. \tag{15.183}$$

Inverting, term by term, we obtain

$$F(t) = C\sum_{n=0}^\infty \frac{(-1)^nt^{2n}}{(2^nn!)^2}. \tag{15.184}$$

When $C$ is set equal to 1, as required by the initial condition $F(0) = 1$, $F(t)$ is
just $J_0(t)$, our familiar Bessel function of order zero. Hence

$$\mathcal{L}\{J_0(t)\} = \frac{1}{\sqrt{s^2 + 1}}. \tag{15.185}$$

Note that we assumed s > 1. The proof for s > 0 is left as a problem. 
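The two series in this example are easy to test numerically. The sketch below sums Eq. 15.183 (with C = 1) and compares it with the closed form of Eq. 15.182, and sums Eq. 15.184 at t = 1 for comparison with the tabulated value $J_0(1) \approx 0.7651976866$. The numbers of terms and the sample points are illustrative assumptions.

```python
import math

def f_series(s, terms=60):
    """Partial sum of Eq. 15.183 with C = 1 (the series requires s > 1)."""
    total = 0.0
    for n in range(terms):
        coeff = math.factorial(2 * n) / (2**n * math.factorial(n)) ** 2
        total += (-1) ** n * coeff / s ** (2 * n + 1)
    return total

def j0_series(t, terms=40):
    """Partial sum of Eq. 15.184 with C = 1."""
    return sum((-1) ** n * t ** (2 * n) / (2**n * math.factorial(n)) ** 2
               for n in range(terms))

s = 2.0                                   # illustrative value with s > 1
closed = 1.0 / math.sqrt(s**2 + 1.0)      # Eq. 15.182 with C = 1
print(f_series(s), closed)
print(j0_series(1.0))
```

Each term of Eq. 15.183 inverts through $\mathcal{L}\{t^{2n}\} = (2n)!/s^{2n+1}$, which is why the factor $(2n)!$ cancels between the two series.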

It is perhaps worth noting that this application was successful and relatively
easy because we took $n = 0$ in Bessel’s equation. This made it possible to divide
out a factor of $x$ (or $t$). If this had not been done, the terms of the form $t^2F(t)$
would have introduced a second derivative of $f(s)$. The resulting equation
would have been no easier to solve than the original one.

When we go beyond linear differential equations with constant coefficients,
the Laplace transform may still be applied, but there is no guarantee that it
will be helpful.

The application to Bessel’s equation, $n \ne 0$, will be found in the references.
Alternatively, we can show that

$$\mathcal{L}\{J_n(at)\} = \frac{a^{-n}\left(\sqrt{s^2 + a^2} - s\right)^n}{\sqrt{s^2 + a^2}} \tag{15.186}$$

by expressing $J_n(t)$ as an infinite series and transforming term by term.

Integration of Transforms 

Again, with $F(t)$ at least piecewise continuous and $x$ large enough so that
$e^{-xt}F(t)$ decreases exponentially (as $x \to \infty$), the integral

$$f(x) = \int_0^\infty e^{-xt}F(t)\,dt \tag{15.187}$$

is uniformly convergent with respect to $x$. This justifies reversing the order of
integration in the following equation:

$$\int_s^b f(x)\,dx = \int_s^b\!\int_0^\infty e^{-xt}F(t)\,dt\,dx = \int_0^\infty \frac{F(t)}{t}\left(e^{-st} - e^{-bt}\right)dt, \tag{15.188}$$

on integrating with respect to $x$. The lower limit $s$ is chosen large enough so that
$f(s)$ is within the region of uniform convergence. Now letting $b \to \infty$, we have

$$\int_s^\infty f(x)\,dx = \int_0^\infty \frac{F(t)}{t}e^{-st}\,dt = \mathcal{L}\left\{\frac{F(t)}{t}\right\}, \tag{15.189}$$

provided that $F(t)/t$ is finite at $t = 0$ or diverges less strongly than $t^{-1}$ (so that
$\mathcal{L}\{F(t)/t\}$ will exist).
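Equation 15.189 can be tried out on $F(t) = \sin t$, for which $f(x) = 1/(x^2 + 1)$ by Eq. 15.128 with k = 1, so that the left-hand side is the elementary integral $\pi/2 - \tan^{-1}s$. This is a sketch only; the choice of F, the value of s, and the quadrature details are illustrative assumptions.

```python
import math

def simpson(g, lo, hi, n=20000):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def sin_over_t(t):
    """F(t)/t for F = sin, with the finite limit 1 supplied at t = 0."""
    return 1.0 if t == 0.0 else math.sin(t) / t

s = 1.5                                        # illustrative value
# Left side of Eq. 15.189: integral of 1/(x^2 + 1) from s to infinity.
lhs = math.pi / 2 - math.atan(s)
# Right side: transform of sin(t)/t, truncated where exp(-s t) = exp(-40).
rhs = simpson(lambda t: math.exp(-s * t) * sin_over_t(t), 0.0, 40.0 / s)
print(lhs, rhs)
```

Here $F(t)/t$ is finite at $t = 0$, so the proviso stated with Eq. 15.189 is satisfied.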


Limits of Integration — Unit Step Function 

The actual limits of integration for the Laplace transform may be specified 
with the (Heaviside) unit step function 


$$u(t - k) = \begin{cases} 0, & t < k, \\ 1, & t > k. \end{cases}$$

For instance,

$$\mathcal{L}\{u(t - k)\} = \int_k^\infty e^{-st}\,dt = \frac{1}{s}e^{-ks}.$$

A rectangular pulse of width $k$ and unit height is described by $F(t) =
u(t) - u(t - k)$. Taking the Laplace transform, we obtain

$$\mathcal{L}\{u(t) - u(t - k)\} = \int_0^k e^{-st}\,dt = \frac{1}{s}\left(1 - e^{-ks}\right).$$


The unit step function is also used in Eq. 15.163 and could be invoked in 
Exercise 15.10.13. 


EXERCISES 


15.10.1 Solve Eq. 15.154, which describes a damped simple harmonic oscillator for
$X(0) = X_0$, $X'(0) = 0$, and

(a) $b^2 = 4km$ (critically damped),

(b) $b^2 > 4km$ (overdamped).

ANS. (a) $X(t) = X_0e^{-(b/2m)t}\left(1 + \dfrac{b}{2m}t\right)$.

15.10.2 Solve Eq. 15.154, which describes a damped simple harmonic oscillator for
$X(0) = 0$, $X'(0) = v_0$, and

(a) $b^2 < 4km$ (underdamped),

(b) $b^2 = 4km$ (critically damped),

(c) $b^2 > 4km$ (overdamped).

ANS. (a) $X(t) = \dfrac{v_0}{\omega_1}e^{-(b/2m)t}\sin\omega_1t$,

(b) $X(t) = v_0te^{-(b/2m)t}$.

15.10.3 The motion of a body falling in a resisting medium may be described by

$$m\frac{d^2X(t)}{dt^2} = mg - b\frac{dX(t)}{dt}$$

when the retarding force is proportional to the velocity. Find $X(t)$ and $dX(t)/dt$
for the initial conditions

$$X(0) = \left.\frac{dX}{dt}\right|_{t=0} = 0.$$

15.10.4 Ringing circuit. In certain electronic circuits resistance, inductance, and
capacitance are placed in the plate circuit in parallel (Fig. 15.11). A constant
voltage is maintained across the parallel elements, keeping the capacitor
charged. At time $t = 0$ the circuit is disconnected from the voltage source.
Find the voltages across the parallel elements R, L, and C as a function of
time. Assume R to be large.

Hint. By Kirchhoff’s laws

$$I_R + I_C + I_L = 0 \quad \text{and} \quad E_R = E_C = E_L,$$

where

$$E_R = I_RR, \qquad E_C = \frac{q_0}{C} + \frac{1}{C}\int_0^t I_C\,dt, \qquad E_L = L\frac{dI_L}{dt},$$

$q_0$ = initial charge of capacitor.

With the DC impedance of L = 0, let $I_L(0) = I_0$, $E_L(0) = 0$. This means $q_0 = 0$.

FIG. 15.11 Ringing circuit

15.10.5 With $J_0(t)$ expressed as a contour integral, apply the Laplace transform
operation, reverse the order of integration, and thus show that

$$\mathcal{L}\{J_0(t)\} = (s^2 + 1)^{-1/2}, \qquad \text{for } s > 0.$$

15.10.6 Develop the Laplace transform of $J_n(t)$ from $\mathcal{L}\{J_0(t)\}$ by using the Bessel
function recurrence relations.

Hint. Here is a chance to use mathematical induction.

15.10.7 A calculation of the magnetic field of a circular current loop in circular
cylindrical coordinates leads to the integral

$$\int_0^\infty e^{-kz}kJ_1(ka)\,dk, \qquad \Re(z) > 0.$$

Show that this integral is equal to $a/(z^2 + a^2)^{3/2}$.

15.10.8 The electrostatic potential of a point charge $q$ at the origin in circular
cylindrical coordinates is

$$\frac{q}{4\pi\varepsilon_0}\int_0^\infty e^{-kz}J_0(k\rho)\,dk = \frac{q}{4\pi\varepsilon_0}\cdot\frac{1}{(\rho^2 + z^2)^{1/2}}, \qquad \Re(z) > 0.$$

From this relation show that the Fourier cosine and sine transforms of $J_0(k\rho)$
are

(a) $\sqrt{\dfrac{\pi}{2}}\,F_c\{J_0(k\rho)\} = \displaystyle\int_0^\infty J_0(k\rho)\cos k\zeta\,dk = \begin{cases} (\rho^2 - \zeta^2)^{-1/2}, & \rho > \zeta, \\ 0, & \rho < \zeta. \end{cases}$

(b) $\sqrt{\dfrac{\pi}{2}}\,F_s\{J_0(k\rho)\} = \displaystyle\int_0^\infty J_0(k\rho)\sin k\zeta\,dk = \begin{cases} 0, & \rho > \zeta, \\ (\zeta^2 - \rho^2)^{-1/2}, & \rho < \zeta. \end{cases}$

Hint. Replace $z$ by $z + i\zeta$ and take the limit as $z \to 0$.

15.10.9 Show that

$$\mathcal{L}\{I_0(at)\} = (s^2 - a^2)^{-1/2}, \qquad s > a.$$

15.10.10 Verify the following Laplace transforms:

(a) $\mathcal{L}\{j_0(at)\} = \mathcal{L}\left\{\dfrac{\sin at}{at}\right\} = \dfrac{1}{a}\cot^{-1}\left(\dfrac{s}{a}\right)$,

(b) $\mathcal{L}\{n_0(at)\}$ does not exist,

(c) $\mathcal{L}\{i_0(at)\} = \mathcal{L}\left\{\dfrac{\sinh at}{at}\right\} = \dfrac{1}{2a}\ln\dfrac{s + a}{s - a} = \dfrac{1}{a}\coth^{-1}\left(\dfrac{s}{a}\right)$,

(d) $\mathcal{L}\{k_0(at)\}$ does not exist.

15.10.11 Develop a Laplace transform solution of Laguerre’s equation

$$tF''(t) + (1 - t)F'(t) + nF(t) = 0.$$

Note that you need a derivative of a transform and a transform of derivatives.
Go as far as you can with $n$; then (and only then) set $n = 0$.

15.10.12 Show that the Laplace transform of the Laguerre polynomial $L_n(at)$ is given
by

$$\mathcal{L}\{L_n(at)\} = \frac{(s - a)^n}{s^{n+1}}, \qquad s > 0.$$

15.10.13 Show that

$$\mathcal{L}\{E_1(t)\} = \frac{1}{s}\ln(s + 1), \qquad s > 0,$$

where

$$E_1(t) = \int_t^\infty \frac{e^{-\tau}}{\tau}\,d\tau = \int_1^\infty \frac{e^{-xt}}{x}\,dx.$$

$E_1(t)$ is the exponential-integral function.

15.10.14 (a) From Eq. 15.189 show that

$$\int_0^\infty f(x)\,dx = \int_0^\infty \frac{F(t)}{t}\,dt,$$

provided the integrals exist.

(b) From the preceding result show that

$$\int_0^\infty \frac{\sin t}{t}\,dt = \frac{\pi}{2},$$

in agreement with Eqs. 15.122 and 7.41.

15.10.15 (a) Show that

$$\mathcal{L}\left\{\frac{\sin kt}{t}\right\} = \cot^{-1}\left(\frac{s}{k}\right).$$

(b) Using this result (with $k = 1$), prove that

$$\mathcal{L}\{\operatorname{si}(t)\} = -\frac{1}{s}\tan^{-1}s,$$

where

$$\operatorname{si}(t) = -\int_t^\infty \frac{\sin x}{x}\,dx, \quad \text{the sine integral.}$$

15.10.16 If $F(t)$ is periodic (Fig. 15.12) with a period $a$ so that $F(t + a) = F(t)$ for all
$t \ge 0$, show that

$$\mathcal{L}\{F(t)\} = \frac{\displaystyle\int_0^a e^{-st}F(t)\,dt}{1 - e^{-as}}$$

with the integration now over only the first period of $F(t)$.

FIG. 15.12 Periodic function


15.10.17 Find the Laplace transform of the square wave (period $a$) defined by

$$F(t) = \begin{cases} 1, & 0 < t < a/2, \\ 0, & a/2 < t < a. \end{cases}$$

ANS. $f(s) = \dfrac{1}{s}\cdot\dfrac{1 - e^{-as/2}}{1 - e^{-as}}$.


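15.10.18 Show that

(a) $\mathcal{L}\{\cosh at\cos at\} = \dfrac{s^3}{s^4 + 4a^4}$,

(b) $\mathcal{L}\{\cosh at\sin at\} = \dfrac{as^2 + 2a^3}{s^4 + 4a^4}$,

(c) $\mathcal{L}\{\sinh at\cos at\} = \dfrac{as^2 - 2a^3}{s^4 + 4a^4}$,

(d) $\mathcal{L}\{\sinh at\sin at\} = \dfrac{2a^2s}{s^4 + 4a^4}$.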


15.10.19 Show that

(a) $\mathcal{L}^{-1}\{(s^2 + a^2)^{-2}\} = \dfrac{1}{2a^3}\sin at - \dfrac{1}{2a^2}t\cos at$,

(b) $\mathcal{L}^{-1}\{s(s^2 + a^2)^{-2}\} = \dfrac{1}{2a}t\sin at$,

(c) $\mathcal{L}^{-1}\{s^2(s^2 + a^2)^{-2}\} = \dfrac{1}{2a}\sin at + \dfrac{1}{2}t\cos at$,

(d) $\mathcal{L}^{-1}\{s^3(s^2 + a^2)^{-2}\} = \cos at - \dfrac{a}{2}t\sin at$.

15.10.20 Show that

$$\mathcal{L}\{(t^2 - k^2)^{-1/2}u(t - k)\} = K_0(ks).$$

Hint. Try transforming an integral representation of $K_0(ks)$ into the Laplace
transform integral.

15.10.21 The Laplace transform

$$\int_0^\infty e^{-xs}xJ_0(x)\,dx = \frac{s}{(s^2 + 1)^{3/2}}$$

may be rewritten as

$$\frac{1}{s^2}\int_0^\infty e^{-y}yJ_0\left(\frac{y}{s}\right)dy = \frac{s}{(s^2 + 1)^{3/2}},$$

which is in Gauss-Laguerre quadrature form. Evaluate this integral for
s = 1.0, 0.9, 0.8, . . . decreasing s in steps of 0.1 until the relative error rises to
10 percent. (The effect of decreasing s is to make the integrand oscillate more
rapidly per unit length of y, thus decreasing the accuracy of the numerical
quadrature.)

15.10.22 (a) Evaluate

$$\int_0^\infty e^{-kz}kJ_1(ka)\,dk$$

by the Gauss-Laguerre quadrature. Take a = 1 and z = 0.1 (0.1) 1.0.

(b) From the analytic form, Exercise 15.10.7, calculate the absolute error
and the relative error.


15.11 CONVOLUTION OR FALTUNG THEOREM 

One of the most important properties of the Laplace transform is that given 
by the convolution or faltung theorem. 1 We take two transforms 

$$f_1(s) = \mathcal{L}\{F_1(t)\} \quad \text{and} \quad f_2(s) = \mathcal{L}\{F_2(t)\} \tag{15.190}$$

and multiply them together. To avoid complications when changing variables,
we hold the upper limits finite:

$$f_1(s)\cdot f_2(s) = \lim_{a\to\infty}\int_0^a e^{-sx}F_1(x)\,dx\int_0^{a-x}e^{-sy}F_2(y)\,dy. \tag{15.191}$$

The upper limits are chosen so that the area of integration, shown in Fig. 15.13a,
is the shaded triangle, not the square. If we integrate over a square in the xy-plane,
we have a parallelogram in the tz-plane, which simply adds complications.
This modification is permissible because the two integrands are assumed to
decrease exponentially. In the limit $a \to \infty$ the integral over the unshaded
triangle will give zero contribution. Substituting $x = t - z$, $y = z$, the region
of integration is mapped into the triangle shown in Fig. 15.13b. To verify the
mapping, map the vertices: $t = x + y$, $z = y$.

1 An alternate derivation employs the Bromwich integral (Section 15.12).
This is Exercise 15.12.3.

FIG. 15.13 Change of variables, (a) xy-plane (b) tz-plane

Using Jacobians to transform the
element of area, we have


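$$dx\,dy = \begin{vmatrix} \dfrac{\partial x}{\partial t} & \dfrac{\partial y}{\partial t} \\[2mm] \dfrac{\partial x}{\partial z} & \dfrac{\partial y}{\partial z} \end{vmatrix}dt\,dz = \begin{vmatrix} 1 & 0 \\ -1 & 1 \end{vmatrix}dt\,dz \tag{15.192}$$

or $dx\,dy = dt\,dz$. With this substitution Eq. 15.191 becomes

$$f_1(s)\cdot f_2(s) = \lim_{a\to\infty}\int_0^a e^{-st}\int_0^t F_1(t - z)F_2(z)\,dz\,dt = \mathcal{L}\left\{\int_0^t F_1(t - z)F_2(z)\,dz\right\}. \tag{15.193}$$

For convenience this integral is represented by the symbol

$$\int_0^t F_1(t - z)F_2(z)\,dz \equiv F_1 * F_2 \tag{15.194}$$

and referred to as the convolution, closely analogous to the Fourier convolution
(Section 15.5). If we substitute $w = t - z$, we find

$$F_1 * F_2 = F_2 * F_1, \tag{15.195}$$

showing that the relation is symmetric.

Carrying out the inverse transform, we also find

$$\mathcal{L}^{-1}\{f_1(s)\cdot f_2(s)\} = \int_0^t F_1(t - z)F_2(z)\,dz. \tag{15.196}$$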

Jo 

This can be useful in the development of new transforms or as an alternative to 
a partial fraction expansion. One immediate application is in the solution of 
integral equations (Section 16.2). Since the upper limit t is variable, this Laplace 
convolution is useful in treating Volterra integral equations. The Fourier 
convolution with fixed (infinite) limits would apply to Fredholm integral 
equations. 
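The convolution integral of Eq. 15.194 is straightforward to evaluate numerically. The sketch below assumes the illustrative pair $F_1(t) = e^{-t}$ and $F_2(t) = \sin t$, for which elementary integration gives $F_1 * F_2 = \tfrac{1}{2}(\sin t - \cos t + e^{-t})$; by Eq. 15.196 this is the inverse transform of $1/[(s + 1)(s^2 + 1)]$. The Simpson rule and the sample time are assumptions as well.

```python
import math

def simpson(g, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3

def convolve(F1, F2, t):
    """Laplace convolution of Eq. 15.194: integral of F1(t - z) F2(z) over 0 <= z <= t."""
    return simpson(lambda z: F1(t - z) * F2(z), 0.0, t)

F1 = lambda t: math.exp(-t)      # illustrative choice
F2 = math.sin                    # illustrative choice

t = 2.0
direct = convolve(F1, F2, t)
swapped = convolve(F2, F1, t)    # Eq. 15.195: the order should not matter
closed = 0.5 * (math.sin(t) - math.cos(t) + math.exp(-t))
print(direct, swapped, closed)
```

Note that the upper limit of the integral is the variable t itself, which is the feature that ties this convolution to Volterra equations.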
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EXAMPLE 15.11.1 Driven Oscillator with Damping 


As one illustration of the use of the convolution theorem, let us return to
the mass $m$ on a spring, with damping and a driving force $F(t)$. The equation
of motion (15.129) now becomes

$$mX''(t) + bX'(t) + kX(t) = F(t). \tag{15.197}$$

Initial conditions $X(0) = 0$, $X'(0) = 0$ are used to simplify this illustration,
and the transformed equation is

$$ms^2x(s) + bs\,x(s) + k\,x(s) = f(s) \tag{15.198}$$

or

$$x(s) = \frac{f(s)}{m}\cdot\frac{1}{(s + b/2m)^2 + \omega_1^2}, \tag{15.199}$$

where $\omega_1^2 \equiv k/m - b^2/4m^2$, as before.

By the convolution theorem (Eq. 15.193 or 15.196),

$$X(t) = \frac{1}{m\omega_1}\int_0^t F(t - z)e^{-(b/2m)z}\sin\omega_1z\,dz. \tag{15.200}$$

If the force is impulsive, $F(t) = P\delta(t)$, 2

$$X(t) = \frac{P}{m\omega_1}e^{-(b/2m)t}\sin\omega_1t. \tag{15.201}$$


P represents the momentum transferred by the impulse and the constant P/m 
takes the place of an initial velocity X'(0). 

If $F(t) = F_0\sin\omega t$, Eq. 15.200 may be used, but a partial fraction expansion
is perhaps more convenient. With

$$f(s) = \frac{F_0\omega}{s^2 + \omega^2}$$

Eq. 15.199 becomes

$$x(s) = \frac{F_0\omega}{m}\cdot\frac{1}{s^2 + \omega^2}\cdot\frac{1}{(s + b/2m)^2 + \omega_1^2} = \frac{F_0\omega}{m}\left[\frac{a's + b'}{s^2 + \omega^2} + \frac{c's + d'}{(s + b/2m)^2 + \omega_1^2}\right]. \tag{15.202}$$

The coefficients $a'$, $b'$, $c'$, and $d'$ are independent of $s$. Direct calculation shows

$$-\frac{1}{a'} = \frac{b}{m}\omega^2 + \frac{m}{b}(\omega_0^2 - \omega^2)^2,$$

$$-\frac{1}{b'} = -\frac{b}{m(\omega_0^2 - \omega^2)}\left[\frac{b}{m}\omega^2 + \frac{m}{b}(\omega_0^2 - \omega^2)^2\right].$$

2 Note that $\delta(t)$ lies inside the interval $[0, t]$.

Since $c'$ and $d'$ will lead to exponentially decreasing terms (transients), they
will be discarded here. Carrying out the inverse operation, we find for the
steady-state solution

$$X(t) = \frac{F_0}{\left[b^2\omega^2 + m^2(\omega_0^2 - \omega^2)^2\right]^{1/2}}\sin(\omega t - \varphi), \tag{15.203}$$

where

$$\tan\varphi = \frac{b\omega}{m(\omega_0^2 - \omega^2)}.$$

Differentiating the denominator, we find that the amplitude has a maximum
when

$$\omega^2 = \omega_0^2 - \frac{b^2}{2m^2} = \omega_1^2 - \frac{b^2}{4m^2}. \tag{15.204}$$

This is the resonance condition. 3 At resonance the amplitude becomes $F_0/b\omega_1$,
showing that the mass $m$ goes into infinite oscillation at resonance if damping
is neglected ($b = 0$). It is worth noting that we have had three different
characteristic frequencies:

$$\omega_2^2 = \omega_0^2 - \frac{b^2}{2m^2},$$

resonance for forced oscillations, with damping,

$$\omega_1^2 = \omega_0^2 - \frac{b^2}{4m^2},$$

free oscillation frequency, with damping,

$$\omega_0^2 = \frac{k}{m},$$

free oscillation frequency, no damping. They coincide only if the damping is
zero.
Returning to Eqs. 15.197 and 15.199, Eq. 15.197 is our differential equation 
for the response of a dynamical system to an arbitrary driving force. The final 
response clearly depends on both the driving force and the characteristics of 
our system. This dual dependence is separated in the transform space. In 
Eq. 15.199 the transform of the response (output) appears as the product of 
two factors, one describing the driving force (input) and the other describing 
the dynamical system. This latter part, which modifies the input and yields 
the output, is often called a transfer function. Specifically, $[(s + b/2m)^2 + \omega_1^2]^{-1}$
is the transfer function corresponding to this damped oscillator. The concept
of a transfer function is of great use in the field of servomechanisms. Often the
characteristics of a particular servomechanism are described by giving its
transfer function. The convolution theorem then yields the output signal for
a particular input signal.

3 The amplitude (squared) has the typical resonance denominator, the Lorentz
line shape, Exercise 15.3.9.


EXERCISES 


15.11.1 From the convolution theorem show that

$$\frac{1}{s}f(s) = \mathcal{L}\left\{\int_0^t F(x)\,dx\right\},$$

where $f(s) = \mathcal{L}\{F(t)\}$.

15.11.2 If $F(t) = t^a$ and $G(t) = t^b$, $a > -1$, $b > -1$,

(a) Show that the convolution

$$F * G = t^{a+b+1}\int_0^1 y^a(1 - y)^b\,dy.$$

(b) By using the convolution theorem, show that

$$\int_0^1 y^a(1 - y)^b\,dy = \frac{a!\,b!}{(a + b + 1)!}.$$

When replacing $a$ by $a - 1$ and $b$ by $b - 1$, we have the Euler formula for the
beta function (Eq. 10.60).


15.11.3 Using the convolution integral, calculate


jsr 1 


{s 2 + a 2 )(s 2 + b 2 ) 


a 2 ^ b 2 . 


15.11.4 An undamped oscillator is driven by a force $F_0\sin\omega t$. Find the displacement as
a function of time. Notice that it is a linear combination of two simple harmonic
motions, one with the frequency of the driving force and one with the frequency
$\omega_0$ of the free oscillator. (Assume $X(0) = X'(0) = 0$.)

ANS. $X(t) = \dfrac{F_0/m}{\omega^2 - \omega_0^2}\left(\dfrac{\omega}{\omega_0}\sin\omega_0t - \sin\omega t\right)$.

Other exercises involving the Laplace convolution appear in Section 16.2. 


15.12 INVERSE LAPLACE TRANSFORMATION 

Bromwich Integral

We now develop an expression for the inverse Laplace transform, $\mathcal{L}^{-1}$,
appearing in the equation

$$F(t) = \mathcal{L}^{-1}\{f(s)\}. \tag{15.205}$$

One approach lies in the Fourier transform for which we know the inverse
relation. There is a difficulty, however. Our Fourier transformable function
had to satisfy the Dirichlet conditions. In particular, we required that

$$\lim_{\omega\to\infty}G(\omega) = 0 \tag{15.206}$$

so that the infinite integral would be well defined. 1 Now we wish to treat
functions, $F(t)$, that may diverge exponentially. To surmount this difficulty,
we extract an exponential factor, $e^{\gamma t}$, from our (possibly) divergent Laplace
function and write

$$F(t) = e^{\gamma t}G(t). \tag{15.207}$$


(15.207) 


If $F(t)$ diverges as $e^{\alpha t}$, we require $\gamma$ to be greater than $\alpha$ so that $G(t)$ will be
convergent. Now, with $G(t) = 0$ for $t < 0$ and otherwise suitably restricted so
that it may be represented by a Fourier integral (Eq. 15.20),

$$G(t) = \frac{1}{2\pi}\int_{-\infty}^\infty e^{iut}\,du\int_0^\infty G(v)e^{-iuv}\,dv. \tag{15.208}$$

Using Eq. 15.207, we may rewrite (15.208) as

$$F(t) = \frac{e^{\gamma t}}{2\pi}\int_{-\infty}^\infty e^{iut}\,du\int_0^\infty F(v)e^{-\gamma v}e^{-iuv}\,dv. \tag{15.209}$$

Now with the change of variable,

$$s = \gamma + iu, \tag{15.210}$$

the integral over $v$ is thrown into the form of a Laplace transform

$$\int_0^\infty F(v)e^{-sv}\,dv = f(s); \tag{15.211}$$


$s$ is now a complex variable and $\Re(s) > \gamma$ to guarantee convergence. Notice that
the Laplace transform has mapped a function specified on the positive real axis
onto the complex plane, $\Re(s) > \gamma$. 2

With $\gamma$ as a constant, $ds = i\,du$. Substituting Eq. 15.211 into Eq. 15.209, we
obtain

$$F(t) = \frac{1}{2\pi i}\int_{\gamma - i\infty}^{\gamma + i\infty}e^{st}f(s)\,ds. \tag{15.212}$$

Here is our inverse transform. We have rotated the line of integration through
90° (by using $ds = i\,du$). The path has become an infinite vertical line in the
complex plane, the constant $\gamma$ having been chosen so that all the singularities of $f(s)$
are on the left-hand side (Fig. 15.14).

Equation 15.212, our inverse transformation, is usually known as the
Bromwich integral, although sometimes it is referred to as the Fourier-Mellin
theorem or Fourier-Mellin integral. This integral may now be evaluated by the
regular methods of contour integration (Chapter 7). If $t > 0$, the contour may be
closed by an infinite semicircle in the left half-plane. Then by the residue theorem
(Section 7.2)


¹ If delta functions are included, $G(\omega)$ may be a cosine. Although this does
not satisfy Eq. 15.206, $G(\omega)$ is still bounded.

2 For a derivation of the inverse Laplace transform using only real variables 
see C. L. Bohn and R. W. Flynn, “Real Variable Inversion of Laplace Trans- 
forms: An Application in Plasma Physics.” Am. J. Phys. 46 , 1250 (1978). 
FIG. 15.14 Singularities of $e^{st} f(s)$

$$ F(t) = \sum (\text{residues included for } \Re(s) < \gamma). \tag{15.213} $$

Possibly this means of evaluation with $\Re(s)$ ranging through negative values
seems paradoxical in view of our previous requirement that $\Re(s) > \gamma$. The
paradox disappears when we recall that the requirement $\Re(s) > \gamma$ was imposed
to guarantee convergence of the Laplace transform integral that defined f(s).
Once f(s) is obtained, we may then proceed to exploit its properties as an analytical
function in the complex plane wherever we choose.³ In effect we are
employing analytical continuation to get $\mathcal{L}\{F(t)\}$ in the left half-plane exactly
as the recurrence relation for the factorial function was used to extend the Euler
integral definition (Eq. 10.5) to the left half-plane.

Perhaps a pair of examples may clarify the evaluation of Eq. 15.212. 


EXAMPLE 15.12.1 Inversion via Calculus of Residues 


If $f(s) = a/(s^2 - a^2)$, then

$$ e^{st} f(s) = \frac{a e^{st}}{s^2 - a^2} = \frac{a e^{st}}{(s + a)(s - a)}. \tag{15.214} $$

The residues may be found by using Exercise 7.1.1 or various other means. The
first step is to identify the singularities, the poles. Here we have one simple pole
at s = a and another simple pole at s = -a. By Exercise 7.1.1 the residue at
s = a is $\tfrac{1}{2} e^{at}$ and the residue at s = -a is $-\tfrac{1}{2} e^{-at}$. Then

$$ \text{Residues} = \tfrac{1}{2}\left(e^{at} - e^{-at}\right) = \sinh at = F(t) \tag{15.215} $$

in agreement with Eq. 15.105.
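The residue sum of this example can be spot-checked numerically. The following sketch estimates each residue from a small circular contour around the pole; the helper name `residue`, the radius, and the sample values a = 1.5, t = 0.8 are assumptions made only for illustration.

```python
import cmath
import math

def residue(g, pole, radius=0.1, n=64):
    """Residue of g at an isolated pole, from (1/(2*pi*i)) * closed integral of g(s) ds
    on a small circle around the pole (trapezoidal rule in the angle)."""
    total = 0.0 + 0.0j
    for k in range(n):
        phase = cmath.exp(2j * math.pi * k / n)
        # ds = i * radius * phase * dtheta; the factor i cancels against 1/(2*pi*i)
        total += g(pole + radius * phase) * radius * phase
    return total / n

a, t = 1.5, 0.8   # assumed sample values

def integrand(s):
    # e^{st} f(s) with f(s) = a / (s^2 - a^2), as in Eq. 15.214
    return a * cmath.exp(s * t) / (s * s - a * a)

F = residue(integrand, a) + residue(integrand, -a)
print(F.real, math.sinh(a * t))
```

The circle radius must stay smaller than the distance 2a between the two poles so that each contour encloses exactly one singularity.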


EXAMPLE 15.12.2 


If

$$ f(s) = \frac{1 - e^{-as}}{s}, $$

³ In numerical work f(s) may well be available only for discrete real, positive
values of s. Then numerical procedures are indicated. See Section 15.8 and
the reference to Krylov and Skoblya.

then we have

$$ e^{st} f(s) = \frac{e^{st}}{s} - e^{-as}\, \frac{e^{st}}{s}. \tag{15.216} $$


The first term on the right has a simple pole at s = 0, residue = 1. Then by Eq. 
15.213 


$$ F_1(t) = \begin{cases} 1, & t > 0, \\ 0, & t < 0, \end{cases} \;=\; u(t), \tag{15.217} $$

where u(t) is the unit step function. Neglecting the minus sign and the $e^{-as}$, we
find that the second term on the right also has a simple pole at s = 0, residue = 1.
Noting the translation property (Eq. 15.164), we have

$$ F_2(t) = \begin{cases} 1, & t - a > 0, \\ 0, & t - a < 0, \end{cases} \;=\; u(t - a). \tag{15.218} $$

Therefore

$$ F(t) = F_1(t) - F_2(t) = \begin{cases} 0, & t < 0, \\ 1, & 0 < t < a, \\ 0, & t > a, \end{cases} \;=\; u(t) - u(t - a), \tag{15.219} $$

a step function of unit height and length a (Fig. 15.15).


FIG. 15.15 Finite-length step function $u(t) - u(t - a)$


Two general comments may be in order. First, these two examples hardly be- 
gin to show the usefulness and power of the Bromwich integral. It is always avail- 
able for inverting a complicated transform when the tables prove inadequate. 

Second, this derivation is not presented as a rigorous one. Rather, it is given 
more as a plausibility argument, although it can be made rigorous. The deter- 
mination of the inverse transform is somewhat similar to the solution of a 
differential equation. It makes little difference how you get the solution. Guess 
at it if you want. The solution can always be checked by substitution back into 
the original differential equation. Similarly, F(t) can (and, to check on careless
errors, should) be checked by determining whether by Eq. 15.99

$$ \mathcal{L}\{F(t)\} = f(s). $$

Two alternate derivations of the Bromwich integral are the subjects of Exercises 
15.12.1 and 15.12.2. 
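The check recommended here, transforming a candidate F(t) forward and comparing with f(s), is easy to carry out numerically. The sketch below applies it to the finite-length step function of Example 15.12.2; the helper `laplace_transform` (composite Simpson rule) and the sample values a = 1.5, s = 2 are assumptions for illustration only.

```python
import math

def laplace_transform(F, s, t_max, n=20_000):
    """Composite Simpson approximation of integral_0^t_max exp(-s*t) F(t) dt (n even)."""
    h = t_max / n
    total = F(0.0) + math.exp(-s * t_max) * F(t_max)
    for k in range(1, n):
        t = k * h
        total += (4 if k % 2 else 2) * math.exp(-s * t) * F(t)
    return total * h / 3.0

a, s = 1.5, 2.0   # assumed sample values

def pulse(t):
    # u(t) - u(t - a): unit height on 0 <= t <= a, zero elsewhere
    return 1.0 if 0.0 <= t <= a else 0.0

# The pulse vanishes for t > a, so the transform integral may stop at t = a.
numeric = laplace_transform(pulse, s, a)
closed_form = (1.0 - math.exp(-a * s)) / s
print(numeric, closed_form)
```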

As a final illustration of the use of the Laplace inverse transform, we have 
some results from the work of Brillouin and Sommerfeld (1914) in electromag- 
netic theory. 

EXAMPLE 15.12.3 Velocity of Electromagnetic Waves in a Dispersive 
Medium 

The group velocity u of traveling waves is related to the phase velocity v by 
the equation 

$$ u = v - \lambda \frac{dv}{d\lambda}. \tag{15.220} $$

Here $\lambda$ is the wavelength. In the vicinity of an absorption line (resonance) $dv/d\lambda$
may be sufficiently negative so that u > c (Fig. 15.16). The question immediately
arises whether a signal can be transmitted faster than c, the velocity of light in 
vacuum. This question, which assumes that such a group velocity is meaningful, 
is of fundamental importance to the theory of special relativity. 



FIG. 15.16 Optical dispersion 


We need a solution to the wave equation 


$$ \frac{\partial^2 \psi}{\partial x^2} = \frac{1}{v^2} \frac{\partial^2 \psi}{\partial t^2}, \tag{15.221} $$

corresponding to a harmonic vibration starting at the origin at time zero. Since
our medium is dispersive, v is a function of the angular frequency. Imagine, for
instance, a plane wave, angular frequency $\omega$, incident on a shutter at the origin.
At t = 0 the shutter is (instantaneously) opened, and the wave is permitted to
advance along the positive x-axis.

Let us then build up a solution starting at x = 0. It is convenient to use the 
Cauchy integral formula, Eq. 6.43, 

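$$ \frac{1}{2\pi i} \oint \frac{e^{-izt}}{z - z_0}\, dz = e^{-i z_0 t} $$

(for a contour encircling $z = z_0$ in the positive sense). Using $s = -iz$ and $z_0 = \omega$,
we obtain

$$ \psi(0, t) = \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} \frac{e^{st}}{s + i\omega}\, ds = \begin{cases} 0, & t < 0, \\ e^{-i\omega t}, & t > 0. \end{cases} \tag{15.222} $$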


To be complete, the loop integral is over the vertical line $\Re(s) = \gamma$ and an infinite
semicircle as shown in Fig. 15.17. The location of the infinite semicircle is chosen
so that the integral over it vanishes. This means a semicircle in the left half-plane
for t > 0 and the residue is enclosed. For t < 0 we pick the right half-plane and
no singularity is enclosed.

FIG. 15.17 Possible closed contours (one panel for t < 0, one for t > 0)

The fact that this is just the Bromwich integral may
be verified by noting that

$$ F(t) = \begin{cases} 0, & t < 0, \\ e^{-i\omega t}, & t > 0, \end{cases} \tag{15.223} $$

and applying the Laplace transform. The transformed function f(s) becomes

$$ f(s) = \frac{1}{s + i\omega}. \tag{15.224} $$


Our Cauchy-Bromwich integral provides us with the time dependence of a 
signal leaving the origin at t = 0. To include the space dependence, we note that 

$$ e^{s(t - x/v)} $$

satisfies the wave equation. With this as a clue, we replace t by t - x/v and write
a solution

$$ \psi(x, t) = \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} \frac{e^{s(t - x/v)}}{s + i\omega}\, ds. \tag{15.225} $$

It was seen in the derivation of the Bromwich integral that our variable s
replaces the $\omega$ of the Fourier transformation. Hence the wave velocity v becomes
a function of s, that is, v(s). Its particular form need not concern us here. We need
only the property

$$ \lim_{|s| \to \infty} v(s) = \text{constant}, \; c. \tag{15.226} $$

This is suggested by the asymptotic behavior of the curve on the right side of
Fig. 15.16.⁴

Evaluating Eq. 15.225 by the calculus of residues, we may close the path of
integration by a semicircle in the right half-plane, provided

$$ t - \frac{x}{c} < 0. $$

Hence

$$ \psi(x, t) = 0, \qquad t - \frac{x}{c} < 0, \tag{15.227} $$

which means that the velocity of our signal cannot exceed the velocity of light in 
vacuum c. This simple but very significant result was extended by Sommerfeld 
and Brillouin to show just how the wave advanced in the dispersive medium. 


Summary — Inversion of Laplace Transform 

1. Direct use of tables, Table 15.2, and references; use 
of partial fractions (Section 15.8) and the operational 
theorems of Table 15.1. 

2. Bromwich integral, Eq. 15.212, and the calculus of 
residues. 

3. Numerical inversion, Section 15.8, and references. 
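As one concrete instance of item 3, the sketch below implements the Gaver-Stehfest formula, a standard numerical inversion scheme that samples f(s) only at real, positive s. It is offered as an illustration and is not necessarily the procedure of Section 15.8; the function names and the choice n = 12 are assumptions.

```python
import math

def stehfest_coefficients(n):
    """Weights V_1..V_n of the Gaver-Stehfest formula (n must be even)."""
    half = n // 2
    weights = []
    for k in range(1, n + 1):
        total = 0.0
        for j in range((k + 1) // 2, min(k, half) + 1):
            total += (j ** half * math.factorial(2 * j)
                      / (math.factorial(half - j) * math.factorial(j)
                         * math.factorial(j - 1) * math.factorial(k - j)
                         * math.factorial(2 * j - k)))
        weights.append((-1) ** (k + half) * total)
    return weights

def invert(f, t, n=12):
    """Approximate F(t) from samples of f at the real points s = k*ln(2)/t, k = 1..n."""
    ln2 = math.log(2.0)
    v = stehfest_coefficients(n)
    return ln2 / t * sum(v[k - 1] * f(k * ln2 / t) for k in range(1, n + 1))

# f(s) = 1/(s + 1) is the transform of exp(-t) (Table 15.2, entry 4 with k = -1).
print(invert(lambda s: 1.0 / (s + 1.0), 1.0), math.exp(-1.0))
```

The weights alternate in sign and grow rapidly with n, so in double precision moderate values of n are usually preferable to large ones; the method also tends to struggle with oscillatory F(t).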


EXERCISES 

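15.12.1 Derive the Bromwich integral from Cauchy's integral formula.
Hint. Apply the inverse transform $\mathcal{L}^{-1}$ to

$$ f(s) = \frac{1}{2\pi i} \lim_{\alpha \to \infty} \int_{\gamma - i\alpha}^{\gamma + i\alpha} \frac{f(z)}{s - z}\, dz, $$

where f(z) is analytic for $\Re(z) > \gamma$.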


4 Equation 15.226 follows rigorously from the theory of anomalous disper- 
sion. See also the Kronig-Kramers optical dispersion relations of Section 7.3. 
15.12.2 Starting with

$$ \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} e^{st} f(s)\, ds, $$

show that by introducing

$$ f(s) = \int_{0}^{\infty} e^{-sz} F(z)\, dz, $$

we can convert one integral into the Fourier representation of a Dirac delta
function. From this derive the inverse Laplace transform.

15.12.3 Derive the Laplace transformation convolution theorem by use of the Bromwich
integral.


15.12.4 Find 


jsr 


— k“ 


(a) by a partial fraction expansion, 

(b) repeat, using the Bromwich integral. 


15.12.5 Find

$$ \mathcal{L}^{-1}\left\{ \frac{k^2}{s(s^2 + k^2)} \right\} $$


(a) by using a partial fraction expansion, 

(b) repeat using the convolution theorem, 

(c) repeat using the Bromwich integral. 


ANS. F(t) = 1 — cos kt. 


15.12.6 Use the Bromwich integral to find the function whose transform is $f(s) = s^{-1/2}$.
Note that f(s) has a branch point at s = 0. The negative x-axis may be taken
as a cut line.

ANS. $F(t) = (\pi t)^{-1/2}$.


15.12.7 Show that 

$$ \mathcal{L}^{-1}\{(s^2 + 1)^{-1/2}\} = J_0(t) $$
by evaluation of the Bromwich integral.

Hint. Convert your Bromwich integral into an integral representation of $J_0(t)$.
Figure 15.18 shows a possible contour. 

15.12.8 Evaluate the inverse Laplace transform

$$ \mathcal{L}^{-1}\{(s^2 - a^2)^{-1/2}\} $$

by each of the following methods:

(a) Expansion in a series and term-by-term inversion,

(b) Direct evaluation of the Bromwich integral,

(c) Change of variable in the Bromwich integral: $s = (a/2)(z + z^{-1})$.


15.12.9 Show that

$$ \mathcal{L}^{-1}\left\{ \frac{\ln s}{s} \right\} = -\ln t - \gamma, $$

where $\gamma = 0.5772\ldots$, the Euler-Mascheroni constant.

FIG. 15.18 A possible contour for the inversion of $J_0(t)$


15.12.10 Evaluate the Bromwich integral for

$$ f(s) = \frac{s}{(s^2 + a^2)^2}. $$


15.12.11 Heaviside expansion theorem. If the transform f(s) may be written as a ratio

$$ f(s) = \frac{g(s)}{h(s)}, $$

where g(s) and h(s) are analytic functions, h(s) having simple, isolated zeros
at $s = s_i$, show that

$$ F(t) = \mathcal{L}^{-1}\left\{ \frac{g(s)}{h(s)} \right\} = \sum_i \frac{g(s_i)}{h'(s_i)}\, e^{s_i t}. $$


Hint. See Exercise 7.1.2. 


15.12.12 Using the Bromwich integral, invert $f(s) = s^{-2} e^{-ks}$. Express $F(t) = \mathcal{L}^{-1}\{f(s)\}$
in terms of the (shifted) unit step function u(t - k).

ANS. $F(t) = (t - k)\, u(t - k)$.


15.12.13 You have a Laplace transform:

$$ f(s) = \frac{1}{(s + a)(s + b)}, \qquad a \neq b. $$

Invert this transform by each of three methods:

(a) Partial fractions and use of tables,

(b) Convolution theorem,

(c) Bromwich integral.

ANS.

$$ F(t) = \frac{e^{-bt} - e^{-at}}{a - b}, \qquad a \neq b. $$
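TABLE 15.1 Laplace Transform Operations

Operation                       Equation                                                                          Reference
1. Laplace transform            $f(s) = \mathcal{L}\{F(t)\} = \int_0^\infty e^{-st} F(t)\, dt$                    (15.99)
2. Transform of derivative      $s f(s) - F(+0) = \mathcal{L}\{F'(t)\}$                                           (15.123)
                                $s^2 f(s) - s F(+0) - F'(+0) = \mathcal{L}\{F''(t)\}$
3. Transform of integral        $\frac{1}{s} f(s) = \mathcal{L}\left\{ \int_0^t F(x)\, dx \right\}$               (Exercise 15.11.1)
4. Substitution                 $f(s - a) = \mathcal{L}\{e^{at} F(t)\}$                                           (15.152)
5. Translation                  $e^{-bs} f(s) = \mathcal{L}\{F(t - b)\}$                                          (15.164)
6. Derivative of transform      $f^{(n)}(s) = \mathcal{L}\{(-t)^n F(t)\}$                                         (15.173)
7. Integral of transform        $\int_s^\infty f(x)\, dx = \mathcal{L}\left\{ \frac{F(t)}{t} \right\}$            (15.189)
8. Convolution                  $f_1(s) f_2(s) = \mathcal{L}\left\{ \int_0^t F_1(t - z) F_2(z)\, dz \right\}$     (15.193)
9. Inverse transform,           $\frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} e^{st} f(s)\, ds = F(t)$   (15.212)
   Bromwich integral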
TABLE 15.2 Laplace Transforms

No.   f(s)                                              F(t)                     Limitation           Equation
1.    $1$                                               $\delta(t)$              Singularity at +0    (15.141)
2.    $\frac{1}{s}$                                     $1$                      s > 0                (15.102)
3.    $\frac{n!}{s^{n+1}}$                              $t^n$                    s > 0, n > -1        (15.108)
4.    $\frac{1}{s - k}$                                 $e^{kt}$                 s > k                (15.103)
5.    $\frac{1}{(s - k)^2}$                             $t e^{kt}$               s > k                (15.175)
6.    $\frac{s}{s^2 - k^2}$                             $\cosh kt$               s > k                (15.105)
7.    $\frac{k}{s^2 - k^2}$                             $\sinh kt$               s > k                (15.105)
8.    $\frac{s}{s^2 + k^2}$                             $\cos kt$                s > 0                (15.107)
9.    $\frac{k}{s^2 + k^2}$                             $\sin kt$                s > 0                (15.107)
10.   $\frac{s - a}{(s - a)^2 + k^2}$                   $e^{at} \cos kt$         s > a                (15.153)
11.   $\frac{k}{(s - a)^2 + k^2}$                       $e^{at} \sin kt$         s > a                (15.153)
12.   $\frac{s^2 - k^2}{(s^2 + k^2)^2}$                 $t \cos kt$              s > 0                (15.172)
13.   $\frac{2ks}{(s^2 + k^2)^2}$                       $t \sin kt$              s > 0                (15.172)
14.   $(s^2 + a^2)^{-1/2}$                              $J_0(at)$                s > 0                (15.185)
15.   $(s^2 - a^2)^{-1/2}$                              $I_0(at)$                s > a                (Exercise 15.10.10)
16.   $\frac{1}{a} \cot^{-1}\left(\frac{s}{a}\right)$   $j_0(at)$                s > 0                (Exercise 15.10.11)
17.   $\frac{1}{2a} \ln \frac{s + a}{s - a}$            $i_0(at)$                s > a                (Exercise 15.10.11)
      $= \frac{1}{a} \coth^{-1}\left(\frac{s}{a}\right)$
18.   $\frac{(s - a)^n}{s^{n+1}}$                       $L_n(at)$                s > 0                (Exercise 15.10.13)
19.   $\frac{1}{s} \ln(s + 1)$                          $E_1(x) = -Ei(-x)$       s > 0                (Exercise 15.10.14)
20.   $\frac{\ln s}{s}$                                 $-\ln t - C$             s > 0                (Exercise 15.12.9)


A more extensive table of Laplace transforms appears in Chapter 29 of AMS-55. 
16 INTEGRAL EQUATIONS


16.1 INTRODUCTION 

With the exception of the integral transforms of the last chapter, we have
been considering equations with relations between the unknown function $\varphi(x)$
and one or more of its derivatives. We now proceed to investigate equations
containing the unknown function within an integral. As with differential equations,
we shall largely confine our attention to linear relations, linear integral
equations. Integral equations are classified in two ways:

1. If the limits of integration are fixed, we call the equa- 
tion a Fredholm equation; if one limit is variable, 
it is a Volterra equation. 

2. If the unknown function appears only under the 
integral sign, we label it “first kind.” If it appears 
both inside and outside the integral, it is labeled, 

“second kind.” 


Definitions 

Symbolically, we have Fredholm equation of the first kind:

$$ f(x) = \int_a^b K(x, t)\, \varphi(t)\, dt. \tag{16.1} $$

Fredholm equation of the second kind:

$$ \varphi(x) = f(x) + \lambda \int_a^b K(x, t)\, \varphi(t)\, dt. \tag{16.2} $$

Volterra equation of the first kind:

$$ f(x) = \int_a^x K(x, t)\, \varphi(t)\, dt. \tag{16.3} $$

Volterra equation of the second kind:

$$ \varphi(x) = f(x) + \int_a^x K(x, t)\, \varphi(t)\, dt. \tag{16.4} $$

In all four cases $\varphi(t)$ is the unknown function. K(x, t), which we call the kernel,
and f(x) are assumed to be known. When f(x) = 0, the equation is said to be
homogeneous.

The reader may wonder, with some justification, why we bother about in- 
tegral equations. After all, the differential equations have done a rather good 
job of describing our physical world so far. There are several reasons for intro- 
ducing integral equations here. 

We have placed considerable emphasis on the solution of differential equations
subject to particular boundary conditions. For instance, the boundary condition
at r = 0 determines whether the Neumann function $N_n(r)$ is present when
Bessel's equation is solved. The boundary condition for $r \to \infty$ determines
whether the $I_n(r)$ is present in our solution of the modified Bessel equation. The
integral equation relates the unknown function not only to its values at neigh- 
boring points (derivatives) but also to its values throughout a region, including 
the boundary. In a very real sense the boundary conditions are built into the 
integral equation rather than imposed at the final stage of the solution. It will be 
seen later, when we construct kernels (Section 16.5), that the form of the kernel 
depends on the values on the boundary. The integral equation, then, is compact 
and may turn out to be a more convenient or powerful form than the differential 
equation. Mathematical problems such as existence, uniqueness, and complete- 
ness may often be handled more easily and elegantly in integral form. Finally, 
whether or not we like it, there are some problems, such as some diffusion and 
transport phenomena, that cannot be represented by differential equations. If 
we wish to solve such problems, we are forced to handle integral equations. A 
most important example of this sort of physical situation follows. 

EXAMPLE 16.1.1 Neutron Transport Theory — Boltzmann Equation 

The fundamental equation of neutron transport theory is an expression of the 
equation of continuity for neutrons: 

Production = losses + leakage. 

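Under production we have sources

$$ S(v, \boldsymbol{\Omega}, \mathbf{r})\, dv\, d\Omega $$

representing the introduction of S neutrons per cubic centimeter per second with
speeds between v and v + dv and direction of motion $\boldsymbol{\Omega}$ within a solid angle $d\Omega$.

An additional source is provided by scattering collisions that scatter neutrons
into the ranges just listed. The rate of scattering is given by

$$ \Sigma_s(v, v', \boldsymbol{\Omega}, \boldsymbol{\Omega}')\, \varphi(v', \boldsymbol{\Omega}', \mathbf{r}), $$

where $\Sigma_s$ is the (macroscopic) probability that a neutron of speed v', direction
$\boldsymbol{\Omega}'$, will be scattered with resultant speed v, direction $\boldsymbol{\Omega}$. The quantity $\varphi(v', \boldsymbol{\Omega}', \mathbf{r})$
is the neutron flux. Expressed as a vector, $\boldsymbol{\varphi} = \boldsymbol{\Omega} \varphi$ has the direction of the
neutron velocity and a magnitude equal to the number of neutrons per second
of speed v crossing a unit area at position $\mathbf{r}$ and in a direction $\boldsymbol{\Omega}$ (Fig. 16.1).
Integrating over available initial speeds (v') and over all directions ($\boldsymbol{\Omega}'$), we
obtain

$$ \iint \Sigma_s(v, v', \boldsymbol{\Omega}, \boldsymbol{\Omega}')\, \varphi(v', \boldsymbol{\Omega}', \mathbf{r})\, dv'\, d\Omega' $$

for the second production term.

Losses come from leakage given by

$$ \nabla \cdot \boldsymbol{\varphi}(v, \boldsymbol{\Omega}, \mathbf{r}) $$

and from absorption and scattering into another (lower) velocity range. These
are

$$ \left[ \Sigma_a(v) + \Sigma_s(v) \right] \varphi(v, \boldsymbol{\Omega}, \mathbf{r}). $$

If the medium is not homogeneous and isotropic, the $\Sigma$'s may have position
and direction-dependence in addition to the indicated speed or energy dependence.

Our equation of continuity finally becomes

$$ \iint \Sigma_s(v, v', \boldsymbol{\Omega}, \boldsymbol{\Omega}')\, \varphi(v', \boldsymbol{\Omega}', \mathbf{r})\, dv'\, d\Omega' + S(v, \boldsymbol{\Omega}, \mathbf{r}) = \nabla \cdot \boldsymbol{\varphi}(v, \boldsymbol{\Omega}, \mathbf{r}) + \left[ \Sigma_a(v) + \Sigma_s(v) \right] \varphi(v, \boldsymbol{\Omega}, \mathbf{r}). \tag{16.5} $$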


This is the steady-state Boltzmann equation, an integro-differential equation. In 
this form the Boltzmann equation is almost impossible to handle. Most of 
neutron transport theory is a development of methods that are compromises 
between physical accuracy and mathematical feasibility. 1 

An integral equation may also appear as a matter of deliberate choice based 
on convenience or the need for the mathematical power of an integral equation 
formulation. 


EXAMPLE 16.1.2 Momentum Representation in Quantum Mechanics 
The Schrodinger equation (in ordinary space representation) is 


Compare H. Soodak, Ed., Reactor Handbook , 2nd ed., vol. Ill, part A, 
Physics. New York: Interscience Publishers (1962). Compare Chapter 3. 
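$$ -\frac{\hbar^2}{2m} \nabla^2 \psi(\mathbf{r}) + V(\mathbf{r})\, \psi(\mathbf{r}) = E\, \psi(\mathbf{r}) \tag{16.6} $$

or

$$ (-\nabla^2 + a^2)\, \psi(\mathbf{r}) = v(\mathbf{r})\, \psi(\mathbf{r}), \tag{16.7} $$

where

$$ a^2 = -\frac{2mE}{\hbar^2}, \qquad v(\mathbf{r}) = -\frac{2m}{\hbar^2} V(\mathbf{r}). $$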


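We may generalize Eq. 16.7 to

$$ (-\nabla^2 + a^2)\, \psi(\mathbf{r}) = \int v(\mathbf{r}, \mathbf{r}')\, \psi(\mathbf{r}')\, d^3r'. \tag{16.8} $$

For the special case of

$$ v(\mathbf{r}, \mathbf{r}') = v(\mathbf{r}')\, \delta(\mathbf{r} - \mathbf{r}'), \tag{16.9} $$

which represents local interaction, Eq. 16.8 reduces to Eq. 16.7. Equation 16.8
is now subject to the Fourier transform (compare Section 15.6).

$$ \Phi(\mathbf{k}) = \frac{1}{(2\pi)^{3/2}} \int \psi(\mathbf{r})\, e^{-i\mathbf{k} \cdot \mathbf{r}}\, d^3r, \qquad \psi(\mathbf{r}) = \frac{1}{(2\pi)^{3/2}} \int \Phi(\mathbf{k})\, e^{i\mathbf{k} \cdot \mathbf{r}}\, d^3k. \tag{16.10} $$

Here the abbreviation

$$ \frac{\mathbf{p}}{\hbar} = \mathbf{k} \quad \text{(wave number)} \tag{16.11} $$

has been introduced. Developing Eq. 16.10, we obtain

$$ \int (-\nabla^2 + a^2)\, \psi(\mathbf{r})\, e^{-i\mathbf{k} \cdot \mathbf{r}}\, d^3r = \iint v(\mathbf{r}, \mathbf{r}')\, \psi(\mathbf{r}')\, e^{-i\mathbf{k} \cdot \mathbf{r}}\, d^3r'\, d^3r. \tag{16.12} $$

Note that the $\nabla^2$ on the left operates only on the $\psi(\mathbf{r})$. Integrating the left-hand
side by parts and substituting Eq. 16.10 for $\psi(\mathbf{r}')$ on the right, we get

$$ \int (k^2 + a^2)\, \psi(\mathbf{r})\, e^{-i\mathbf{k} \cdot \mathbf{r}}\, d^3r = (2\pi)^{3/2} (k^2 + a^2)\, \Phi(\mathbf{k}) = \frac{1}{(2\pi)^{3/2}} \iiint v(\mathbf{r}, \mathbf{r}')\, \Phi(\mathbf{k}')\, e^{-i(\mathbf{k} \cdot \mathbf{r} - \mathbf{k}' \cdot \mathbf{r}')}\, d^3r'\, d^3r\, d^3k'. \tag{16.13} $$

If we use

$$ f(\mathbf{k}, \mathbf{k}') = \frac{1}{(2\pi)^3} \iint v(\mathbf{r}, \mathbf{r}')\, e^{-i(\mathbf{k} \cdot \mathbf{r} - \mathbf{k}' \cdot \mathbf{r}')}\, d^3r'\, d^3r, \tag{16.14} $$

Eq. 16.13 becomes

$$ (k^2 + a^2)\, \Phi(\mathbf{k}) = \int f(\mathbf{k}, \mathbf{k}')\, \Phi(\mathbf{k}')\, d^3k', \tag{16.15} $$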


a homogeneous Fredholm equation of the second kind in which the parameter 
a 2 corresponds to the eigenvalue. 

For our special but important case of local interaction, application of
Eq. 16.9 leads to

$$ f(\mathbf{k}, \mathbf{k}') = f(\mathbf{k} - \mathbf{k}'). \tag{16.16} $$

This is our momentum representation equivalent to an ordinary static
interaction potential in ordinary space. Our momentum function $\Phi(\mathbf{k})$ satisfies
the integral equation (Eq. 16.15). It must be emphasized that all through here
we have assumed that the required Fourier integrals exist. For a linear oscillator
potential, $V(r) = r^2$, the required integrals would not exist. Equation 16.10
would lead to divergent oscillations and we would have no Eq. 16.15.


Transformation of a Differential Equation into 
an Integral Equation 

Often we find that we have a choice. The physical problem may be represented 
by a differential or an integral equation. Let us assume that we have the differ- 
ential equation and wish to transform it into an integral equation. Starting 
with a linear second-order differential equation 

$$ y'' + A(x)\, y' + B(x)\, y = g(x) \tag{16.17} $$

with initial conditions

$$ y(a) = y_0, \qquad y'(a) = y'_0, $$

we integrate to obtain

$$ y' = -\int_a^x A y'\, dx - \int_a^x B y\, dx + \int_a^x g\, dx + y'_0. \tag{16.18} $$

Integrating the first integral on the right by parts yields

$$ y' = -A y - \int_a^x (B - A')\, y\, dx + \int_a^x g\, dx + A(a)\, y_0 + y'_0. \tag{16.19} $$

Notice how the initial conditions are being absorbed into our new version.
Integrating a second time, we obtain

$$ y = -\int_a^x A y\, dx - \int_a^x \int_a^x [B(t) - A'(t)]\, y(t)\, dt\, dx + \int_a^x \int_a^x g(t)\, dt\, dx + [A(a)\, y_0 + y'_0](x - a) + y_0. \tag{16.20} $$

To transform this equation into a neater form, we use the relation

$$ \int_a^x \int_a^x f(t)\, dt\, dx = \int_a^x (x - t)\, f(t)\, dt. \tag{16.21} $$

This may be verified by differentiating both sides. Since the derivatives are equal,
the original expressions can differ only by a constant. Letting $x \to a$, the constant
vanishes and Eq. 16.21 is established. Applying it to Eq. 16.20, we obtain

$$ y(x) = -\int_a^x \{ A(t) + (x - t)[B(t) - A'(t)] \}\, y(t)\, dt + \int_a^x (x - t)\, g(t)\, dt + [A(a)\, y_0 + y'_0](x - a) + y_0. \tag{16.22} $$

If we now introduce the abbreviations

$$ K(x, t) = (t - x)[B(t) - A'(t)] - A(t), $$
$$ f(x) = \int_a^x (x - t)\, g(t)\, dt + [A(a)\, y_0 + y'_0](x - a) + y_0, \tag{16.23} $$

Eq. 16.22 becomes

$$ y(x) = f(x) + \int_a^x K(x, t)\, y(t)\, dt, \tag{16.24} $$

which is a Volterra equation of the second kind. This reformulation as a Volterra 
integral equation offers certain advantages in investigating questions of existence 
and uniqueness. 
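The reduction of the repeated integral to a single integral (Eq. 16.21) is the key step in this transformation, and it can be spot-checked numerically. In the sketch below the integrand cos t, the limits a = 0.3 and x = 1.7, and the helper `simpson` are all assumptions chosen for illustration.

```python
import math

def simpson(g, lo, hi, n=400):
    """Composite Simpson rule on [lo, hi] with n (even) panels."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3.0

a, x = 0.3, 1.7   # assumed sample limits
f = math.cos      # assumed sample integrand

# Left side of Eq. 16.21: inner integral up to x1, then integrate x1 from a to x.
lhs = simpson(lambda x1: simpson(f, a, x1), a, x)
# Right side of Eq. 16.21: a single integral with weight (x - t).
rhs = simpson(lambda t: (x - t) * f(t), a, x)
# Closed form for f = cos, for comparison.
exact = math.cos(a) - math.cos(x) - (x - a) * math.sin(a)
print(lhs, rhs, exact)
```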


EXAMPLE 16.1.3 Linear Oscillator Equation 


As a simple illustration, consider the linear oscillator equation

$$ y'' + \omega^2 y = 0 \tag{16.25} $$

with

$$ y(0) = 0, \qquad y'(0) = 1. $$

This yields

$$ A(x) = 0, \qquad B(x) = \omega^2, \qquad g(x) = 0. $$

Substituting into Eq. 16.22 (or Equations 16.23 and 16.24), we find that the
integral equation becomes

$$ y(x) = x + \omega^2 \int_0^x (t - x)\, y(t)\, dt. \tag{16.26} $$

This integral equation, Eq. 16.26, is equivalent to the original differential equation
plus the initial conditions. The reader may show that each form is indeed
satisfied by $y(x) = (1/\omega) \sin \omega x$.
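A Volterra equation of the second kind such as Eq. 16.26 can be solved numerically by marching forward in x, because the integral at each step involves only values already computed. The following sketch uses the trapezoidal rule; the function name `solve_volterra`, the step count, and the choice $\omega = 2$ are illustrative assumptions, and the result is compared with $(1/\omega) \sin \omega x$.

```python
import math

def solve_volterra(f, kernel, x_max, n):
    """Trapezoidal marching scheme for y(x) = f(x) + integral_0^x K(x,t) y(t) dt."""
    h = x_max / n
    xs = [i * h for i in range(n + 1)]
    ys = [f(xs[0])]                       # at x = 0 the integral vanishes
    for i in range(1, n + 1):
        x = xs[i]
        acc = 0.5 * kernel(x, xs[0]) * ys[0]
        for j in range(1, i):
            acc += kernel(x, xs[j]) * ys[j]
        # The unknown y_i enters through the last trapezoid weight h/2 * K(x, x).
        ys.append((f(x) + h * acc) / (1.0 - 0.5 * h * kernel(x, x)))
    return xs, ys

omega = 2.0   # assumed sample frequency
xs, ys = solve_volterra(lambda x: x,
                        lambda x, t: omega ** 2 * (t - x),
                        2.0, 400)
err = max(abs(y - math.sin(omega * x) / omega) for x, y in zip(xs, ys))
print(err)
```

The cost grows as the square of the number of steps, since every new point requires a sum over all earlier points.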

Let us reconsider the linear oscillator equation (16.25) but now with the
boundary conditions

$$ y(0) = 0, \qquad y(b) = 0. $$

Since y'(0) is not given, we must modify the procedure. The first integration
gives

$$ y' = -\omega^2 \int_0^x y\, dx + y'(0). \tag{16.27} $$

Integrating a second time and again using Eq. 16.21, we have

$$ y = -\omega^2 \int_0^x (x - t)\, y(t)\, dt + y'(0)\, x. \tag{16.28} $$

To eliminate the unknown y'(0), we now impose the condition y(b) = 0. This
gives

$$ \omega^2 \int_0^b (b - t)\, y(t)\, dt = b\, y'(0). \tag{16.29} $$

Substituting this back into Eq. 16.28, we obtain

$$ y(x) = -\omega^2 \int_0^x (x - t)\, y(t)\, dt + \omega^2 \frac{x}{b} \int_0^b (b - t)\, y(t)\, dt. \tag{16.30} $$

Now let us break the interval [0, b] into two intervals [0, x] and [x, b].
Since

$$ \frac{x}{b}(b - t) - (x - t) = \frac{t}{b}(b - x), \tag{16.31} $$

we find

$$ y(x) = \omega^2 \int_0^x \frac{t}{b}(b - x)\, y(t)\, dt + \omega^2 \int_x^b \frac{x}{b}(b - t)\, y(t)\, dt. \tag{16.32} $$

Finally, if we define a kernel (Fig. 16.2)

$$ K(x, t) = \begin{cases} \dfrac{t}{b}(b - x), & t < x, \\[1ex] \dfrac{x}{b}(b - t), & x < t, \end{cases} \tag{16.33} $$

we have

$$ y(x) = \omega^2 \int_0^b K(x, t)\, y(t)\, dt, \tag{16.34} $$

a homogeneous Fredholm equation of the second kind.
FIG. 16.2 (vertical axis: K(x, t))

Our new kernel, K(x, t), has some interesting properties.

1. It is symmetric, K(x, t) = K(t, x).

2. It is continuous in the sense that

$$ \frac{t}{b}(b - x) \Big|_{t = x} = \frac{x}{b}(b - t) \Big|_{t = x}. $$

3. Its derivative with respect to t is discontinuous. As t
increases through the point t = x, there is a discontinuity
of -1 in $\partial K(x, t)/\partial t$.


We shall return to these properties in Section 16.5 in which we identify K(x , t) 
as a Green’s function. 
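Because Eq. 16.34 is homogeneous, nontrivial solutions exist only for particular values of $\omega^2$. A crude numerical check is to replace the integral by a midpoint sum and find the dominant eigenvalue of the resulting matrix by power iteration. The sketch below does this for b = 1; the function names, grid size, and iteration count are assumptions, and the estimate is compared with the value $(\pi/b)^2$ expected from the solution $\sin(\pi x/b)$ of the differential equation.

```python
import math

def green_kernel(x, t, b):
    """Kernel of Eq. 16.33: (t/b)(b - x) for t < x, (x/b)(b - t) for x < t."""
    lo, hi = (t, x) if t < x else (x, t)
    return lo * (b - hi) / b

def smallest_omega_squared(b=1.0, n=100, iterations=60):
    """Estimate the smallest omega^2 for which y = omega^2 * integral K y dt has a solution."""
    h = b / n
    nodes = [(i + 0.5) * h for i in range(n)]             # midpoint rule
    matrix = [[h * green_kernel(x, t, b) for t in nodes] for x in nodes]
    y = [1.0] * n
    mu = 0.0
    for _ in range(iterations):
        z = [sum(row[j] * y[j] for j in range(n)) for row in matrix]
        mu = max(abs(v) for v in z)                        # dominant eigenvalue estimate
        y = [v / mu for v in z]
    return 1.0 / mu                                        # y = omega^2 K y  =>  omega^2 = 1/mu

estimate = smallest_omega_squared()
print(estimate, math.pi ** 2)
```

The symmetry of K(x, t) noted in property 1 makes the discretized matrix symmetric, so its eigenvalues are real.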

In the transformation of a linear, second-order differential equation into an 
integral equation, the initial or boundary conditions play a decisive role. If we 
have initial conditions (only one end of our interval), the differential equation 
transforms into a Volterra integral equation. For the case of the linear oscillator 
equation with boundary conditions (both ends of our interval), the differential 
equation leads to a Fredholm integral equation with a kernel that will be a 
Green’s function. 

It might be noted that the reverse transformation (integral equation to 
differential equation) is not always possible. There exist integral equations for 
which no corresponding differential equation is known. 


EXERCISES 

16.1.1 Starting with the differential equation, integrate twice and derive the Volterra
integral equation corresponding to

(a) $y''(x) - y(x) = 0$; $y(0) = 0$, $y'(0) = 1$.

ANS. $y = \int_0^x (x - t)\, y(t)\, dt + x$.

(b) $y''(x) - y(x) = 0$; $y(0) = 1$, $y'(0) = -1$.

ANS. $y = \int_0^x (x - t)\, y(t)\, dt - x + 1$.

Check your results with Eq. 16.23.

16.1.2 Derive a Fredholm integral equation corresponding to

$$ y''(x) - y(x) = 0; \qquad y(1) = 1, \quad y(-1) = 1, $$

(a) by integrating twice,

(b) by forming the Green's function.

ANS.

$$ y(x) = 1 - \int_{-1}^{1} K(x, t)\, y(t)\, dt, $$

$$ K(x, t) = \begin{cases} \tfrac{1}{2}(1 - x)(t + 1), & x > t, \\ \tfrac{1}{2}(1 - t)(x + 1), & x < t. \end{cases} $$


16.1.3


(a) Starting with the given answers of Exercise 16.1.1, differentiate and recover 
the original differential equations and the boundary conditions. 

(b) Repeat for Exercise 16.1.2. 


16.1.4 The general second-order linear differential equation with constant coefficients
is

$$ y''(x) + a_1 y'(x) + a_2 y(x) = 0. $$

Given the boundary conditions

$$ y(0) = y(1) = 0, $$

integrate twice and develop the integral equation

$$ y(x) = \int_0^1 K(x, t)\, y(t)\, dt, $$

with

$$ K(x, t) = \begin{cases} a_2 t (1 - x) + a_1 (x - 1), & t < x, \\ a_2 x (1 - t) + a_1 x, & x < t. \end{cases} $$

Note that K(x, t) is symmetric and continuous if $a_1 = 0$. How is this related
to self-adjointness of the differential equation?

16.1.5 Verify that $\int_a^x \int_a^x f(t)\, dt\, dx = \int_a^x (x - t)\, f(t)\, dt$ for all f(t) (for which the integrals
exist).

16.1.6 Given $\varphi(x) = x - \int_0^x (t - x)\, \varphi(t)\, dt$. Solve this integral equation by converting it
to a differential equation (plus boundary conditions) and solving the differential
equation (by inspection).

16.1.7 Show that the homogeneous Volterra equation of the second kind

$$ \psi(x) = \lambda \int_0^x K(x, t)\, \psi(t)\, dt $$

has no solution (apart from the trivial $\psi = 0$).

Hint. Develop a Maclaurin expansion of $\psi(x)$. Assume $\psi(x)$ and K(x, t) are
differentiable with respect to x as needed.


16.2 INTEGRAL TRANSFORMS, GENERATING FUNCTIONS

To put the problem of solving integral equations in perspective, we compare
differentiation and integration:

Differentiation                                  Integration
Rules, systematic procedures                     Often no integrated function exists in closed form
Computing machine can be instructed to do        Numerical integration may have to be used
analytic differentiation


Analogous to differentiation, linear differential equations are solved com- 
pletely in Chapter 8. Analogous to integration, there is no general method 
available for inverting integral equations. However, certain special cases may 
be treated with our integral transforms (Chapter 15). For convenience these are 
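listed here. If

$$ \psi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ixt}\, \varphi(t)\, dt, $$

then

$$ \varphi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-ixt}\, \psi(t)\, dt \quad \text{(Fourier).} \tag{16.35} $$

If

$$ \psi(x) = \int_0^{\infty} e^{-xt}\, \varphi(t)\, dt, $$

then

$$ \varphi(x) = \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} e^{xt}\, \psi(t)\, dt \quad \text{(Laplace).} \tag{16.36} $$

If

$$ \psi(x) = \int_0^{\infty} t^{x-1}\, \varphi(t)\, dt, $$

then

$$ \varphi(x) = \frac{1}{2\pi i} \int_{\gamma - i\infty}^{\gamma + i\infty} x^{-t}\, \psi(t)\, dt \quad \text{(Mellin).} \tag{16.37} $$

If

$$ \psi(x) = \int_0^{\infty} t\, \varphi(t)\, J_\nu(xt)\, dt, $$

then

$$ \varphi(x) = \int_0^{\infty} t\, \psi(t)\, J_\nu(xt)\, dt \quad \text{(Hankel).} \tag{16.38} $$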


Actually the usefulness of the integral transform technique extends a bit 
beyond these four rather specialized forms. 
EXAMPLE 16.2.1 Fourier Transform Solution


Let us consider a Fredholm equation of the first kind with a kernel of the
general type k(x - t):

$$ f(x) = \int_{-\infty}^{\infty} k(x - t)\, \varphi(t)\, dt, \tag{16.39} $$

in which $\varphi(t)$ is our unknown function. Assuming that the needed transforms exist,
we apply the Fourier convolution theorem (Section 15.5) to obtain

$$ f(x) = \int_{-\infty}^{\infty} K(\omega)\, \Phi(\omega)\, e^{-i\omega x}\, d\omega. \tag{16.40} $$

The functions $K(\omega)$ and $\Phi(\omega)$ are the Fourier transforms of k(x) and $\varphi(x)$,
respectively. Inverting, by Eq. 16.35, we have

$$ K(\omega)\, \Phi(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} f(x)\, e^{i\omega x}\, dx = \frac{F(\omega)}{\sqrt{2\pi}}. \tag{16.41} $$

Then

$$ \Phi(\omega) = \frac{1}{\sqrt{2\pi}}\, \frac{F(\omega)}{K(\omega)}, \tag{16.42} $$

and again inverting we have

$$ \varphi(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \frac{F(\omega)}{K(\omega)}\, e^{-i\omega x}\, d\omega. \tag{16.43} $$

For a rigorous justification of this result the reader is invited to follow Morse 
and Feshbach across complex planes. An extension of this transformation 
solution appears as Exercise 16.2.1. 


EXAMPLE 16.2.2 Generalized Abel Equation, Convolution Theorem 


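The generalized Abel equation is

$$ f(x) = \int_0^x \frac{\varphi(t)}{(x - t)^{\alpha}}\, dt, \qquad 0 < \alpha < 1, \quad \text{with } f(x) \text{ known, } \varphi(t) \text{ unknown.} \tag{16.44} $$

Taking the Laplace transform of both sides of this equation, we obtain

$$ \mathcal{L}\{f(x)\} = \mathcal{L}\left\{ \int_0^x \frac{\varphi(t)}{(x - t)^{\alpha}}\, dt \right\} = \mathcal{L}\{x^{-\alpha}\}\, \mathcal{L}\{\varphi(x)\}, \tag{16.45} $$

the last step following by the Laplace convolution theorem (Section 15.11).
Then

$$ \mathcal{L}\{\varphi(x)\} = \frac{s^{1-\alpha}\, \mathcal{L}\{f(x)\}}{(-\alpha)!}. \tag{16.46} $$

Dividing by s,¹ we obtain

$$ \frac{1}{s}\, \mathcal{L}\{\varphi(x)\} = \frac{s^{-\alpha}\, \mathcal{L}\{f(x)\}}{(-\alpha)!} = \frac{\mathcal{L}\{x^{\alpha - 1}\}\, \mathcal{L}\{f(x)\}}{(\alpha - 1)!\, (-\alpha)!}. \tag{16.47} $$

Combining the factorials (Eq. 10.32), and applying the Laplace convolution
theorem again, we discover that

$$ \frac{1}{s}\, \mathcal{L}\{\varphi(x)\} = \frac{\sin \pi\alpha}{\pi}\, \mathcal{L}\left\{ \int_0^x \frac{f(t)}{(x - t)^{1 - \alpha}}\, dt \right\}. \tag{16.48} $$

Inverting with the aid of Exercise 15.11.1, we get

$$ \int_0^x \varphi(t)\, dt = \frac{\sin \pi\alpha}{\pi} \int_0^x \frac{f(t)}{(x - t)^{1 - \alpha}}\, dt, \tag{16.49} $$

and finally, by differentiating,

$$ \varphi(x) = \frac{\sin \pi\alpha}{\pi}\, \frac{d}{dx} \int_0^x \frac{f(t)}{(x - t)^{1 - \alpha}}\, dt. \tag{16.50} $$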
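Equation 16.50 can be tested by inserting its output back into Eq. 16.44. The sketch below does this for the special case $\alpha = 1/2$, an assumption made so that the substitution $t = x \sin^2\theta$ removes the endpoint singularity. Two sample inputs are used: f(x) = 1, for which Eq. 16.50 gives $\varphi(t) = (1/\pi)\, t^{-1/2}$, and f(x) = x, for which it gives $\varphi(t) = (2/\pi)\, t^{1/2}$. The function name and sample point are illustrative.

```python
import math

def abel_integral_half(phi, x, n=200):
    """integral_0^x phi(t) / sqrt(x - t) dt for alpha = 1/2.

    The substitution t = x*sin(theta)**2 gives dt / sqrt(x - t) = 2*sqrt(x)*sin(theta) dtheta,
    which removes the singularity at t = x; the midpoint rule avoids theta = 0.
    """
    h = (math.pi / 2.0) / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        t = x * math.sin(theta) ** 2
        total += phi(t) * 2.0 * math.sqrt(x) * math.sin(theta)
    return total * h

x = 0.7   # assumed sample point

# f(x) = 1  ->  phi(t) = (1/pi) * t**(-1/2)   (Eq. 16.50 with alpha = 1/2)
lhs_constant = abel_integral_half(lambda t: 1.0 / (math.pi * math.sqrt(t)), x)
# f(x) = x  ->  phi(t) = (2/pi) * t**(1/2)    (Eq. 16.50 with alpha = 1/2)
lhs_linear = abel_integral_half(lambda t: 2.0 * math.sqrt(t) / math.pi, x)
print(lhs_constant, lhs_linear)
```

If Eq. 16.50 has been applied correctly, the two printed values should reproduce f(x) = 1 and f(x) = x at the chosen point.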


Generating Functions 

Occasionally, the reader may encounter integral equations that involve 
generating functions. Suppose we have the admittedly special case 


$$ f(x) = \int_{-1}^{1} \frac{\varphi(t)}{(1 - 2xt + x^2)^{1/2}}\, dt, \qquad -1 < x < 1. \tag{16.51} $$

We notice two important features:

1. $(1 - 2xt + x^2)^{-1/2}$ generates the Legendre polynomials.

2. [-1, 1] is the orthogonality interval for the Legendre
polynomials.

If we now expand the denominator (property 1) and assume that our unknown
$\varphi(t)$ may be written as a series of these same Legendre polynomials,

$$ f(x) = \int_{-1}^{1} \sum_{n=0}^{\infty} a_n P_n(t) \sum_{r=0}^{\infty} P_r(t)\, x^r\, dt. \tag{16.52} $$

Utilizing the orthogonality of the Legendre polynomials (property 2), we obtain

$$ f(x) = \sum_{r=0}^{\infty} \frac{2 a_r}{2r + 1}\, x^r. \tag{16.53} $$

We may identify the $a_n$'s by differentiating n times and then setting x = 0.

$$ f^{(n)}(0) = n!\, \frac{2}{2n + 1}\, a_n. \tag{16.54} $$

Hence

$$ \varphi(t) = \sum_{n=0}^{\infty} \frac{2n + 1}{2}\, \frac{f^{(n)}(0)}{n!}\, P_n(t). \tag{16.55} $$

¹ $s^{1-\alpha}$ does not have an inverse for $0 < \alpha < 1$.

Similar results may be obtained with the other generating functions (compare 
Exercise 15.2.9). Actually the technique of expanding in a series of special 
functions is always available. It is worth a try whenever the expansion is possible 
(and convenient) and the interval is appropriate. 
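Equation 16.53 can be verified for individual Legendre polynomials. In the sketch below $\varphi(t)$ is taken to be $P_1(t) = t$ and then $P_2(t) = (3t^2 - 1)/2$, so that Eq. 16.53 predicts $f(x) = 2x/3$ and $f(x) = 2x^2/5$, respectively. The helper names and the sample point x = 0.3 are assumptions.

```python
import math

def simpson(g, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with n (even) panels."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(lo + k * h)
    return total * h / 3.0

def f_of_x(phi, x):
    """Right-hand side of Eq. 16.51 for a given phi(t) and |x| < 1."""
    return simpson(lambda t: phi(t) / math.sqrt(1.0 - 2.0 * x * t + x * x), -1.0, 1.0)

x = 0.3   # assumed sample point
value_p1 = f_of_x(lambda t: t, x)                          # phi = P_1: a_1 = 1
value_p2 = f_of_x(lambda t: 0.5 * (3.0 * t * t - 1.0), x)  # phi = P_2: a_2 = 1
print(value_p1, 2.0 * x / 3.0)
print(value_p2, 2.0 * x * x / 5.0)
```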


EXERCISES 

16.2.1 The kernel of a Fredholm equation of the second kind,

$$\varphi(x) = f(x) + \lambda\int_{-\infty}^{\infty}K(x,t)\varphi(t)\,dt,$$

is of the form $k(x-t)$.² Assuming that the required transforms exist, show that

$$\varphi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\frac{F(t)\,e^{-ixt}\,dt}{1-\sqrt{2\pi}\,\lambda K(t)}.$$

$F(t)$ and $K(t)$ are the Fourier transforms of $f(x)$ and $k(x)$, respectively.

16.2.2 The kernel of a Volterra equation of the first kind,

$$f(x) = \int_0^x K(x,t)\varphi(t)\,dt,$$

has the form $k(x-t)$. Assuming that the required transforms exist, show that

$$\varphi(x) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}\frac{F(s)}{K(s)}\,e^{xs}\,ds.$$

$F(s)$ and $K(s)$ are the Laplace transforms of $f(x)$ and $k(x)$, respectively.

16.2.3 The kernel of a Volterra equation of the second kind,

$$\varphi(x) = f(x) + \lambda\int_0^x K(x,t)\varphi(t)\,dt,$$

has the form $k(x-t)$. Assuming that the required transforms exist, show that

$$\varphi(x) = \frac{1}{2\pi i}\int_{\gamma-i\infty}^{\gamma+i\infty}\frac{F(s)}{1-\lambda K(s)}\,e^{xs}\,ds.$$

16.2.4 Using the Laplace transform solution (Exercise 16.2.3), solve

(a) $\varphi(x) = x + \int_0^x (t-x)\varphi(t)\,dt$,

ANS. $\varphi(x) = \sin x$,


2 This kernel and a range $0 < x < \infty$ are the characteristics of integral
equations of the Wiener-Hopf type. Details will be found in Chapter 8 of Morse
and Feshbach.

(b) $\varphi(x) = x - \int_0^x (t-x)\varphi(t)\,dt$.

ANS. $\varphi(x) = \sinh x$.

Check your results by substituting back into the original integral equations.

16.2.5 Reformulate the equations of Example 16.2.1 (Eqs. 16.39 to 16.43), using Fourier
cosine transforms.


16.2.6 Given the Fredholm integral equation,

$$e^{-x^2} = \int_{-\infty}^{\infty}e^{-(x-t)^2}\varphi(t)\,dt,$$

apply the Fourier convolution technique of Example 16.2.1 to solve for $\varphi(t)$.


16.2.7 Solve Abel's equation,

$$f(x) = \int_0^x \frac{\varphi(t)}{(x-t)^{\alpha}}\,dt, \qquad 0 < \alpha < 1,$$

by the following method:

(a) Multiply both sides by $(z-x)^{\alpha-1}$ and integrate with respect to $x$ over the
range $0 < x < z$.

(b) Reverse the order of integration and evaluate the integral on the right-hand
side (with respect to $x$) by the beta function.

Note.

$$\int_t^z \frac{dx}{(z-x)^{1-\alpha}(x-t)^{\alpha}} = B(1-\alpha, \alpha) = (-\alpha)!\,(\alpha-1)!$$


16.2.8 Given the generalized Abel equation with $f(x) = 1$,

$$1 = \int_0^x \frac{\varphi(t)}{(x-t)^{\alpha}}\,dt, \qquad 0 < \alpha < 1.$$

Solve for $\varphi(t)$ and verify that $\varphi(t)$ is a solution of the preceding equation.

ANS. $\varphi(t) = \dfrac{\sin\pi\alpha}{\pi}\,t^{\alpha-1}$.

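16.2.9 A Fredholm equation of the first kind has a kernel $e^{-(x-t)^2}$:

$$f(x) = \int_{-\infty}^{\infty}e^{-(x-t)^2}\varphi(t)\,dt.$$

Show that the solution is

$$\varphi(x) = \frac{1}{\sqrt{\pi}}\sum_{n=0}^{\infty}\frac{f^{(n)}(0)}{2^n\,n!}\,H_n(x),$$

in which $H_n(x)$ is an $n$th-order Hermite polynomial.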

16.2.10 Solve the integral equation

$$f(x) = \int_{-1}^{1}\frac{\varphi(t)\,dt}{(1-2xt+x^2)^{1/2}}$$

for the unknown function $\varphi(t)$ if

(a) $f(x) = x^{2s}$,

(b) $f(x) = x^{2s+1}$.

ANS. (a) $\varphi(t) = \dfrac{4s+1}{2}\,P_{2s}(t)$,

(b) $\varphi(t) = \dfrac{4s+3}{2}\,P_{2s+1}(t)$.


16.2.11 A Kirchhoff diffraction theory analysis of a laser leads to the integral equation

$$v(\mathbf{r}_2) = \gamma\iint K(\mathbf{r}_1, \mathbf{r}_2)\,v(\mathbf{r}_1)\,dA.$$

The unknown $v(\mathbf{r}_1)$ gives the geometric distribution of the radiation field over
one mirror surface; the range of integration is over the surface of that mirror.
For square confocal spherical mirrors the integral equation becomes 


v(x 2 ,y 2 ) = 




g-Uklb^x^+f, y 2 ) t , ( x ^y x )dx x dy x . 


in which b is the centerline distance between the laser mirrors. This can be put 
in a somewhat simpler form by the substitutions 





2na 2 

~Jb 


(a) Show that the variables separate and we get two integral equations. 

(b) Show that the new limits +a may be approximated by Too for a mirror 
dimension a » /. 

(c) Solve the resulting integral equations. 


16.3 NEUMANN SERIES, SEPARABLE
(DEGENERATE) KERNELS

Many and probably most integral equations cannot be solved by the specialized
integral transform techniques of the preceding section. Here we develop
three rather general techniques for solving integral equations. The first, due
largely to Neumann, Liouville, and Volterra, develops the unknown function
$\varphi(x)$ as a power series in $\lambda$, where $\lambda$ is a given constant. The method is applicable
whenever the series converges.

The second method is somewhat restricted because it requires that the two 
variables appearing in the kernel K(x, t) be separable. However, there are two 
major rewards: (1) the relation between an integral equation and a set of 
simultaneous linear algebraic equations is shown explicitly, and (2) the method 
leads to eigenvalues and eigenfunctions — in close analogy to Section 4.6. 

Third, a technique for numerical solution of Fredholm equations of both the 
first and second kind is outlined. The problem posed by ill-conditioned matrices 
is emphasized. 

Neumann Series 

We solve a linear integral equation of the second kind by successive
approximations; our integral equation is the Fredholm equation

$$\varphi(x) = f(x) + \lambda\int_a^b K(x,t)\varphi(t)\,dt, \tag{16.56}$$

in which $f(x) \neq 0$. If the upper limit of the integral is a variable (Volterra
equation), the following development will still hold, but with minor modifications.
Let us try (there is no guarantee that it will work) to approximate our unknown
function by

$$\varphi(x) \approx \varphi_0(x) = f(x). \tag{16.57}$$

This choice is not mandatory. If you can make a better guess, go ahead and
guess. The choice here is equivalent to saying that the integral or the constant
$\lambda$ is small. To improve this first crude approximation, we feed $\varphi_0(x)$ back into
the integral, Eq. 16.56, and get

$$\varphi_1(x) = f(x) + \lambda\int_a^b K(x,t)f(t)\,dt. \tag{16.58}$$


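Repeating this process of substituting the new $\varphi_n(x)$ back into Eq. 16.56, we
develop the sequence

$$\varphi_2(x) = f(x) + \lambda\int_a^b K(x,t_1)f(t_1)\,dt_1 + \lambda^2\int_a^b\!\!\int_a^b K(x,t_1)K(t_1,t_2)f(t_2)\,dt_2\,dt_1 \tag{16.59}$$

and

$$\varphi_n(x) = \sum_{i=0}^{n}\lambda^i u_i(x), \tag{16.60}$$

where

$$\begin{aligned}
u_0(x) &= f(x),\\
u_1(x) &= \int_a^b K(x,t_1)f(t_1)\,dt_1,\\
u_2(x) &= \int_a^b\!\!\int_a^b K(x,t_1)K(t_1,t_2)f(t_2)\,dt_2\,dt_1,\\
u_n(x) &= \int_a^b\!\cdots\!\int_a^b K(x,t_1)K(t_1,t_2)\cdots K(t_{n-1},t_n)f(t_n)\,dt_n\cdots dt_1.
\end{aligned} \tag{16.61}$$

We expect that our solution $\varphi(x)$ will be

$$\varphi(x) = \lim_{n\to\infty}\varphi_n(x) = \lim_{n\to\infty}\sum_{i=0}^{n}\lambda^i u_i(x), \tag{16.62}$$

provided that our infinite series converges.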

We may conveniently check the convergence by the Cauchy ratio test,
Section 5.2, noting that

$$|\lambda^n u_n(x)| \le |\lambda|^n\,|f|_{\max}\,|K|_{\max}^n\,|b-a|^n, \tag{16.63}$$

using $|f|_{\max}$ to represent the maximum value of $|f(x)|$ in the interval $[a, b]$ and
$|K|_{\max}$ to represent the maximum value of $|K(x,t)|$ in its domain in the
$x,t$-plane. We have convergence if

$$|\lambda|\,|K|_{\max}\,|b-a| < 1. \tag{16.64}$$

Note that the bound $|\lambda|^n |u_n|_{\max}$ is being used as a comparison series. If it
converges, our actual series must converge. If this condition is not satisfied, we
may or may not have convergence. A more sensitive test is required. Of course,
even if the Neumann series diverges, there still may be a solution obtainable by
another method.

To see what has been done with this iterative manipulation, we may find it
helpful to rewrite the Neumann series solution, Eq. 16.59, in operator form.
We start by rewriting Eq. 16.56 as

$$\varphi = \lambda K\varphi + f,$$

where $K$ represents the integration operator $\int_a^b K(x,t)[\ ]\,dt$. Solving for $\varphi$,
we obtain

$$\varphi = (1-\lambda K)^{-1}f.$$

Binomial expansion, $(1-\lambda K)^{-1} = 1 + \lambda K + \lambda^2K^2 + \cdots$, leads to Eq. 16.59.
The convergence of the Neumann series is a demonstration that the inverse
operator $(1-\lambda K)^{-1}$ exists.

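EXAMPLE 16.3.1 Neumann Series Solution

To illustrate the Neumann method, we consider the integral equation

$$\varphi(x) = x + \frac{1}{2}\int_{-1}^{1}(t-x)\varphi(t)\,dt. \tag{16.65}$$

To start the Neumann series, we take

$$\varphi_0(x) = x.$$

Then

$$\begin{aligned}
\varphi_1(x) &= x + \frac{1}{2}\int_{-1}^{1}(t-x)\,t\,dt\\
&= x + \frac{1}{2}\left[\frac{1}{3}t^3 - \frac{1}{2}t^2x\right]_{-1}^{1}\\
&= x + \frac{1}{3}.
\end{aligned}$$

Substituting $\varphi_1(x)$ back into Eq. 16.65, we get

$$\begin{aligned}
\varphi_2(x) &= x + \frac{1}{2}\int_{-1}^{1}(t-x)\,t\,dt + \frac{1}{2}\int_{-1}^{1}(t-x)\,\frac{1}{3}\,dt\\
&= x + \frac{1}{3} - \frac{x}{3}.
\end{aligned} \tag{16.66}$$

Continuing this process of substituting back into Eq. 16.65, we obtain

$$\varphi_3(x) = x + \frac{1}{3} - \frac{x}{3} - \frac{1}{3^2},$$

and by induction

$$\varphi_{2n}(x) = x + \sum_{s=1}^{n}(-1)^{s-1}3^{-s} - x\sum_{s=1}^{n}(-1)^{s-1}3^{-s}. \tag{16.67}$$

Letting $n \to \infty$, we get

$$\varphi(x) = \tfrac{3}{4}x + \tfrac{1}{4}. \tag{16.68}$$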

This solution can (and should) be checked by substituting back into the original 
equation, Eq. 16.65. 
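
The iteration of Example 16.3.1 can be sketched in a few lines of Python. This is a minimal illustration rather than a general solver: it assumes the approximations are polynomials in $x$ and uses exact rational arithmetic, and the helper names `moment` and `neumann_step` are hypothetical.

```python
from fractions import Fraction

def moment(k):
    """Exact value of the integral of t**k over [-1, 1]."""
    return Fraction(2, k + 1) if k % 2 == 0 else Fraction(0)

def neumann_step(coeffs, lam=Fraction(1, 2)):
    """One iteration phi -> x + lam * int_{-1}^{1} (t - x) phi(t) dt.

    coeffs[k] is the coefficient of x**k in the current approximation.
    The result is always linear in x and is returned as [constant, slope].
    """
    int_t_phi = sum(c * moment(k + 1) for k, c in enumerate(coeffs))  # int t*phi dt
    int_phi = sum(c * moment(k) for k, c in enumerate(coeffs))        # int phi dt
    return [lam * int_t_phi, 1 - lam * int_phi]

phi = [Fraction(0), Fraction(1)]   # phi_0(x) = x
history = [phi]
for _ in range(40):
    phi = neumann_step(phi)
    history.append(phi)

print(history[1])                  # coefficients of phi_1 = 1/3 + x
print(history[2])                  # coefficients of phi_2 = 1/3 + (2/3) x
print([float(c) for c in phi])     # approaches the limit 1/4 + (3/4) x
```

Successive approximations alternate around the limit and the error shrinks by a factor of 3 every two steps, in line with the alternating sums of Eq. 16.67.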

It is interesting to note that our series converged easily even though Eq. 16.64
is not satisfied in this particular case. Actually Eq. 16.64 is a rather crude upper
bound on $\lambda$. It can be shown that a necessary and sufficient condition for the
convergence of our series solution is that $|\lambda| < |\lambda_e|$, where $\lambda_e$ is the eigenvalue
of smallest magnitude of the corresponding homogeneous equation [$f(x) = 0$].
For this particular example $\lambda_e = \sqrt{3}/2$. Clearly, $\lambda = \tfrac{1}{2} < \lambda_e = \sqrt{3}/2$.

One approach to the calculation of time-dependent perturbations in quantum
mechanics starts with the integral equation for the evolution operator

$$U(t,t_0) = 1 - \frac{i}{\hbar}\int_{t_0}^{t}V(t_1)U(t_1,t_0)\,dt_1. \tag{16.69a}$$

Iteration leads to

$$U(t,t_0) = 1 - \frac{i}{\hbar}\int_{t_0}^{t}V(t_1)\,dt_1 + \left(\frac{i}{\hbar}\right)^2\int_{t_0}^{t}\!\!\int_{t_0}^{t_1}V(t_1)V(t_2)\,dt_2\,dt_1 + \cdots. \tag{16.69b}$$

The evolution operator is obtained as a series of multiple integrals of the
perturbing potential $V(t)$, closely analogous to the Neumann series, Eq. 16.60. For
$V = V_0$, independent of $t$, the evolution operator becomes

$$U(t,t_0) = \exp[-i(t-t_0)V_0/\hbar].$$

A second and similar relationship between the Neumann series and quantum 
mechanics appears when the Schrodinger wave equation for scattering is 
reformulated as an integral equation. The first term in a Neumann series 
solution is the incident (unperturbed) wave. The second term is the Born 
approximation, Eq. 16.191 of Section 16.6. 

The Neumann method may also be applied to Volterra integral equations
of the second kind, Eq. 16.4 or Eq. 16.56 with the fixed upper limit $b$ replaced
by a variable $x$. In the Volterra case the Neumann series converges for all $\lambda$
as long as the kernel is square integrable. 
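
A small illustration of this unrestricted convergence, using an assumed example that is not taken from the text: for the Volterra equation $\varphi(x) = 1 + \lambda\int_0^x \varphi(t)\,dt$ with kernel $K = 1$, the successive approximations starting from $\varphi_0 = 1$ can be written in closed form.

```latex
\varphi_1 = 1 + \lambda x,\qquad
\varphi_2 = 1 + \lambda x + \frac{(\lambda x)^2}{2!},\qquad\dots,\qquad
\varphi_n = \sum_{i=0}^{n}\frac{(\lambda x)^i}{i!}
\;\longrightarrow\; e^{\lambda x}\quad (n\to\infty).
```

The nested integrations over the variable upper limit supply the factor $1/i!$, which is what makes the series converge for every $\lambda$, in contrast to the Fredholm bound of Eq. 16.64.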

Separable Kernel 

The technique of replacing our integral equation by simultaneous algebraic
equations may also be used whenever our kernel $K(x,t)$ is separable in the
sense that

$$K(x,t) = \sum_{j=1}^{n}M_j(x)N_j(t), \tag{16.70}$$

where $n$, the upper limit of the sum, is finite. Such kernels are sometimes called
degenerate. Our class of separable kernels includes all polynomials and many
of the elementary transcendental functions; for example,

$$\cos(t-x) = \cos t\cos x + \sin t\sin x. \tag{16.70a}$$

If Eq. 16.70 is satisfied, substitution into the Fredholm equation of the second
kind, Eq. 16.2, yields

$$\varphi(x) = f(x) + \lambda\sum_{j=1}^{n}M_j(x)\int_a^b N_j(t)\varphi(t)\,dt, \tag{16.71}$$

interchanging integration and summation. Now the integral with respect to $t$
is a constant,

$$\int_a^b N_j(t)\varphi(t)\,dt = c_j. \tag{16.72}$$

Hence Eq. 16.71 becomes

$$\varphi(x) = f(x) + \lambda\sum_{j=1}^{n}c_jM_j(x). \tag{16.73}$$

This gives us $\varphi(x)$, our solution, once the constants $c_j$ have been determined.
Equation 16.73 further tells us the form of $\varphi(x)$: $f(x)$, plus a linear combination
of the $x$-dependent factors of the separable kernel.

We may find $c_i$ by multiplying Eq. 16.73 by $N_i(x)$ and integrating to eliminate
the $x$-dependence. Use of Eq. 16.72 yields

$$c_i = b_i + \lambda\sum_{j=1}^{n}a_{ij}c_j, \tag{16.74}$$

where

$$b_i = \int_a^b N_i(x)f(x)\,dx, \qquad a_{ij} = \int_a^b N_i(x)M_j(x)\,dx. \tag{16.75}$$

It is perhaps helpful to write Eq. 16.74 in matrix form, with $\mathsf{A} = (a_{ij})$:

$$\mathbf{b} = \mathbf{c} - \lambda\mathsf{A}\mathbf{c} = (1-\lambda\mathsf{A})\mathbf{c}, \tag{16.76a}$$

or¹

$$\mathbf{c} = (1-\lambda\mathsf{A})^{-1}\mathbf{b}. \tag{16.76b}$$

Equation 16.76a is equivalent to a set of simultaneous linear algebraic equations

$$\begin{aligned}
(1-\lambda a_{11})c_1 - \lambda a_{12}c_2 - \lambda a_{13}c_3 - \cdots &= b_1,\\
-\lambda a_{21}c_1 + (1-\lambda a_{22})c_2 - \lambda a_{23}c_3 - \cdots &= b_2,\\
-\lambda a_{31}c_1 - \lambda a_{32}c_2 + (1-\lambda a_{33})c_3 - \cdots &= b_3, \quad\text{and so on.}
\end{aligned} \tag{16.77}$$


1 Notice the similarity to the operator form of the Neumann series.

If our integral equation is homogeneous, [$f(x) = 0$], then $\mathbf{b} = 0$. To get a
solution, we set the determinant of the coefficients of $c_i$ equal to zero,

$$|1-\lambda\mathsf{A}| = 0, \tag{16.78}$$

exactly as in Section 4.6. The roots of Eq. 16.78 yield our eigenvalues.
Substituting into $(1-\lambda\mathsf{A})\mathbf{c} = 0$, we find the $c_i$'s and then Eq. 16.73 gives our
solution.

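EXAMPLE 16.3.2

To illustrate this technique for determining eigenvalues and eigenfunctions
of the homogeneous Fredholm equation, we consider the simple case

$$\varphi(x) = \lambda\int_{-1}^{1}(t+x)\varphi(t)\,dt. \tag{16.79}$$

Here

$$M_1 = 1, \quad M_2(x) = x, \qquad N_1(t) = t, \quad N_2 = 1.$$

Equation 16.75 yields

$$a_{11} = a_{22} = 0, \qquad a_{12} = \tfrac{2}{3}, \qquad a_{21} = 2.$$

Equation 16.78, our secular equation, becomes

$$\begin{vmatrix} 1 & -\dfrac{2\lambda}{3}\\[4pt] -2\lambda & 1 \end{vmatrix} = 0. \tag{16.80}$$

Expanding, we obtain

$$1 - \frac{4\lambda^2}{3} = 0, \qquad \lambda = \pm\frac{\sqrt{3}}{2}. \tag{16.81}$$

Substituting the eigenvalues $\lambda = \pm\sqrt{3}/2$ into Eq. 16.76, we have

$$c_1 \mp \frac{c_2}{\sqrt{3}} = 0. \tag{16.82}$$

Finally, with a choice of $c_1 = 1$, Eq. 16.73 gives

$$\varphi_1(x) = \frac{\sqrt{3}}{2}\,(1+\sqrt{3}\,x), \qquad \lambda = \frac{\sqrt{3}}{2}, \tag{16.83}$$

$$\varphi_2(x) = -\frac{\sqrt{3}}{2}\,(1-\sqrt{3}\,x), \qquad \lambda = -\frac{\sqrt{3}}{2}. \tag{16.84}$$

Since our equation is homogeneous, the normalization of $\varphi(x)$ is arbitrary.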

If the kernel is not separable in the sense of Eq. 16.70, there is still the
possibility that it may be approximated by a kernel that is separable. Then we can get
the exact solution of an approximate equation, an equation that approximates
the original equation. The solution of the separable approximate kernel problem
can then be checked by substituting back into the original, unseparable kernel
problem. 
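
The separable-kernel recipe of Eqs. 16.75 and 16.78 can be sketched for the kernel $K(x,t) = t + x$ on $[-1, 1]$ used in Example 16.3.2. The code below is a minimal illustration that assumes every factor $M_j$, $N_j$ is a pure monomial and that there are exactly two terms, so the secular determinant is a quadratic; the helper name `integrate_monomials` is hypothetical.

```python
import math
from fractions import Fraction

def integrate_monomials(p, q):
    """Exact integral over [-1, 1] of x**p * x**q."""
    k = p + q
    return Fraction(2, k + 1) if k % 2 == 0 else Fraction(0)

# K(x, t) = t + x written as M_1(x) N_1(t) + M_2(x) N_2(t),
# with each factor stored by its exponent.
M_powers = [0, 1]   # M_1(x) = 1, M_2(x) = x
N_powers = [1, 0]   # N_1(t) = t, N_2(t) = 1

# a_ij = int N_i(x) M_j(x) dx, Eq. 16.75
A = [[integrate_monomials(n, m) for m in M_powers] for n in N_powers]

# For a 2 x 2 matrix, det(1 - lam*A) = 1 - lam*trace(A) + lam**2 * det(A).
# This sketch assumes det(A) != 0 and real roots.
trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
disc = math.sqrt(float(trace * trace - 4 * det))
eigenvalues = sorted([(float(trace) - disc) / (2 * float(det)),
                      (float(trace) + disc) / (2 * float(det))])

for lam in eigenvalues:
    # First row of (1 - lam*A) c = 0 with the normalization c_1 = 1.
    c2 = (1 - lam * float(A[0][0])) / (lam * float(A[0][1]))
    print(f"lambda = {lam:+.6f}, c = (1, {c2:+.6f})")
```

The two roots are $\pm\sqrt{3}/2$, and the coefficient pairs $(1, \pm\sqrt{3})$ give eigenfunctions proportional to $1 \pm \sqrt{3}\,x$.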


Numerical Solution 

There is extensive literature on the numerical solution of integral equations,
much of it concerning special techniques for certain situations. One method of
fair generality is the replacement of the single integral equation by a set of
simultaneous algebraic equations. And again matrix techniques are invoked.
This simultaneous algebraic equation-matrix approach is applied here to two
different cases. For the homogeneous Fredholm equation of the second kind
this method works well. For the Fredholm equation of the first kind the method
is a disaster. First we deal with the disaster.

We consider the Fredholm integral equation of the first kind

$$f(x) = \int_a^b K(x,t)\varphi(t)\,dt, \tag{16.84a}$$

with $f(x)$ and $K(x,t)$ known and $\varphi(t)$ unknown. The integral can be evaluated
(in principle) by quadrature techniques. For maximum accuracy the Gaussian
method (Appendix 2) is recommended (if the kernel is continuous and has
continuous derivatives). The numerical quadrature replaces the integral by a
summation:

$$f(x_i) = \sum_{k=1}^{n}A_kK(x_i,t_k)\varphi(t_k), \tag{16.84b}$$

with $A_k$ the quadrature coefficients. We abbreviate $f(x_i)$ as $f_i$, $\varphi(t_k)$ as $\varphi_k$, and
$A_kK(x_i,t_k)$ as $B_{ik}$. In effect we are changing from a function description to a
vector-matrix description with the $n$ components of the vector $(f_i)$ defined as
the values of the function at the $n$ discrete points $[f(x_i)]$. Equation 16.84b
becomes

$$f_i = \sum_{k=1}^{n}B_{ik}\varphi_k,$$

a matrix equation. Inverting $(B_{ik})$, we obtain

$$\varphi(x_k) = \varphi_k = \sum_{i=1}^{n}B^{-1}_{ki}f_i, \tag{16.84c}$$

and Eq. 16.84a is solved, in principle. In practice, the quadrature coefficient-kernel
matrix is often "ill-conditioned" (with respect to inversion). This means
that in the inversion process small (numerical) errors are multiplied by large
factors. In the inversion process all significant figures may be lost and Eq. 16.84c
becomes numerical nonsense.

This disaster should not be entirely unexpected. Integration is essentially a
smoothing operation. $f(x)$ is relatively insensitive to local variation of $\varphi(t)$.
Conversely, $\varphi(t)$ may be exceedingly sensitive to small changes in $f(x)$. Small
errors in $f(x)$ or in $\mathsf{B}^{-1}$ are magnified and accuracy disappears. This same
behavior shows up in attempts to invert Laplace transforms numerically
(Section 15.8).

When the quadrature-matrix technique is applied to the integral equation
eigenvalue problem, the symmetric kernel, homogeneous Fredholm equation
of the second kind,²

$$\lambda\varphi(x) = \int_a^b K(x,t)\varphi(t)\,dt, \tag{16.84d}$$

the technique is far more successful. Replacing the integral by a set of
simultaneous algebraic equations (numerical quadrature, Appendix 2), we have

$$\lambda\varphi_i = \sum_{k=1}^{n}A_kK_{ik}\varphi_k, \tag{16.84e}$$

with $\varphi_i = \varphi(x_i)$ as before. The points $x_i$, $i = 1, 2, \ldots, n$ are taken to be the
same (numerically) as $t_k$, $k = 1, 2, \ldots, n$, so that $K_{ik}$ will be symmetric. The
system is symmetrized by multiplying by $A_i^{1/2}$ so that

$$\lambda\,(A_i^{1/2}\varphi_i) = \sum_{k=1}^{n}(A_i^{1/2}K_{ik}A_k^{1/2})(A_k^{1/2}\varphi_k). \tag{16.84f}$$

Replacing $A_i^{1/2}\varphi_i$ by $\psi_i$ and $A_i^{1/2}K_{ik}A_k^{1/2}$ by $S_{ik}$, we obtain

$$\lambda\boldsymbol{\psi} = \mathsf{S}\boldsymbol{\psi}, \tag{16.84g}$$

with $\mathsf{S}$ symmetric (since the kernel $K(x,t)$ was assumed symmetric). $\boldsymbol{\psi}$, of course,
has components $\psi_i = \psi(x_i)$. Equation 16.84g is our matrix eigenvalue equation,
Eq. 4.146. The eigenvalues are readily obtained by calling the SSP EIGEN.³
For kernels such as those of Exercise 16.3.15 and using a 10-point Gauss- 
Legendre quadrature, EIGEN determines the largest eigenvalue to within about 
0.5 percent for the cases where the kernel has discontinuities in its derivatives. 
If the derivatives are continuous, the accuracy is much better. 
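
The quadrature-matrix procedure of Eqs. 16.84e to 16.84g can be sketched in plain Python. The following is an illustration under stated assumptions, not the SSP routine: it uses trapezoidal weights on equally spaced points as a simple stand-in for Gauss-Legendre quadrature, finds only the largest eigenvalue by power iteration, and the function name `largest_eigenvalue` is hypothetical. The test kernel $K(x,t) = \min(x,t)$ on $[0, 1]$ is chosen because reducing it to $\lambda\varphi'' = -\varphi$ with $\varphi(0) = 0$, $\varphi'(1) = 0$ gives the largest eigenvalue in closed form, $4/\pi^2$.

```python
import math

def largest_eigenvalue(kernel, n=100, iterations=60):
    """Estimate the largest eigenvalue of lam*phi(x) = int_0^1 K(x,t) phi(t) dt.

    Trapezoidal weights A_k on n + 1 equally spaced points replace the
    integral; the matrix is symmetrized as S_ik = sqrt(A_i) K_ik sqrt(A_k)
    and power iteration is applied to S.
    """
    h = 1.0 / n
    x = [i * h for i in range(n + 1)]
    w = [h] * (n + 1)
    w[0] = w[-1] = h / 2
    s = [[math.sqrt(w[i]) * kernel(x[i], x[k]) * math.sqrt(w[k])
          for k in range(n + 1)] for i in range(n + 1)]
    psi = [1.0 / math.sqrt(n + 1)] * (n + 1)   # unit-length starting vector
    lam = 0.0
    for _ in range(iterations):
        new = [sum(s_ik * p for s_ik, p in zip(row, psi)) for row in s]
        norm = math.sqrt(sum(v * v for v in new))
        lam = norm                 # psi has unit length, so |S psi| estimates |lambda|
        psi = [v / norm for v in new]
    return lam

lam = largest_eigenvalue(lambda x, t: min(x, t))
print(lam, 4 / math.pi ** 2)
```

Power iteration suffices here because only the dominant eigenvalue is wanted; a full symmetric eigensolver would be needed for the rest of the spectrum.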


2 The eigenvalue $\lambda$ has been written on the left side, multiplying the
eigenfunction, as is customary in matrix analysis (Section 4.6). In this form $\lambda$ will
take on a maximum value.

3 The corresponding subroutine in the PL/1 Scientific Subroutine Package is
MSDU.

Linz⁴ has described an interesting variational refinement in the determination
of $\lambda_{\max}$ to high accuracy. The key to his method is Exercise 17.8.7. The
components of the eigenfunction vector are obtained from Eq. 16.84f with $\varphi(t_k)$ now
known and $\varphi_i = \varphi(x_i)$ generated as required. (The $x_i$ are no longer tied to the $t_k$.)


EXERCISES 

16.3.1 Using the Neumann series, solve

(a) $\varphi(x) = 1 - 2\int_0^x t\,\varphi(t)\,dt$,

(b) $\varphi(x) = x + \int_0^x (t-x)\varphi(t)\,dt$,

(c) $\varphi(x) = x - \int_0^x (t-x)\varphi(t)\,dt$.

ANS. (a) $\varphi(x) = e^{-x^2}$.

16.3.2 Solve the equation

$$\varphi(x) = x + \frac{1}{2}\int_{-1}^{1}(t+x)\varphi(t)\,dt$$

by the separable kernel method. Compare with the Neumann method solution
of Section 16.3.

ANS. $\varphi(x) = \tfrac{1}{2}(3x+1)$.


16.3.3 Find the eigenvalues and eigenfunctions of

$$\varphi(x) = \lambda\int_{-1}^{1}(t-x)\varphi(t)\,dt.$$

16.3.4 Find the eigenvalues and eigenfunctions of

$$\varphi(x) = \lambda\int_0^{2\pi}\cos(x-t)\varphi(t)\,dt.$$

ANS. $\lambda_1 = \lambda_2 = \dfrac{1}{\pi}$,

$\varphi(x) = A\cos x + B\sin x$.


16.3.5 Find the eigenvalues and eigenfunctions of

$$y(x) = \lambda\int_{-1}^{1}(x-t)^2y(t)\,dt.$$


Hint. This problem may be treated by the separable kernel method or by a 
Legendre expansion. 

16.3.6 If the separable kernel technique of this section is applied to a Fredholm
equation of the first kind (Eq. 16.1), show that Eq. 16.76 is replaced by

$$\mathbf{c} = \mathsf{A}^{-1}\mathbf{b}.$$

In general the solution for the unknown $\varphi(t)$ is not unique.


4 Peter Linz, "On the numerical computation of eigenvalues and eigenvectors
of symmetric integral equations," Math. Computation, 24, 905 (1970).
16.3.7 Solve

$$\psi(x) = x + \int_0^1 (1+xt)\psi(t)\,dt$$

by each of the following methods:

(a) the Neumann series technique,

(b) the separable kernel technique,

(c) educated guessing.


16.3.8 


Use the separable kernel technique to show that 


\j/(x) = X 


cos x sin tij/(t)dt 


Jo 

has no solution (apart from the trivial $\psi = 0$). Explain this result in terms of
separability and symmetry.


16.3.9 Solve

$$\varphi(x) = 1 + \lambda^2\int_0^x (x-t)\varphi(t)\,dt$$

by each of the following methods:

(a) Reduction to a differential equation (including establishment of boundary
conditions),

(b) The Neumann series,

(c) The use of Laplace transforms.

ANS. $\varphi(x) = \cosh\lambda x$.


16.3.10 (a) In Eq. 16.69a take $V = V_0$, independent of $t$. Without using Eq. 16.69b,
show that Eq. 16.69a leads directly to

$$U(t,t_0) = \exp[-i(t-t_0)V_0/\hbar].$$

(b) Repeat for Eq. 16.69b without using Eq. 16.69a.


16.3.11 Given $\varphi(x) = \lambda\int_0^1 (1+xt)\varphi(t)\,dt$, solve for the eigenvalues and the eigenfunctions
by the separable kernel technique.


16.3.12 Knowing the form of the solutions can be a great advantage. For the integral
equation

$$\varphi(x) = \lambda\int_0^1 (1+xt)\varphi(t)\,dt,$$

assume $\varphi(x)$ to have the form $1 + bx$. Substitute into the integral equation.
Integrate and solve for $b$ and $\lambda$.


16.3.13 The integral equation

$$\varphi(x) = \lambda\int_0^1 J_0(\alpha xt)\varphi(t)\,dt, \qquad J_0(\alpha) = 0,$$

is approximated by

$$\varphi(x) = \lambda\int_0^1 [1 - x^2t^2]\,\varphi(t)\,dt.$$

Find the minimum eigenvalue $\lambda$ and the corresponding eigenfunction $\varphi(t)$ of
the approximate equation.

ANS. $\lambda_{\min} = 1.112486$,

$\varphi(x) = 1 - 0.303337x^2$.


16.3.14 You are given the integral equation

$$\varphi(x) = \lambda\int_0^1 \sin\pi xt\;\varphi(t)\,dt.$$

Approximate the kernel by

$$K(x,t) = 4xt(1-xt) \approx \sin\pi xt.$$

Find the positive eigenvalue and the corresponding eigenfunction for the
approximate integral equation.

Note. For $K(x,t) = \sin\pi xt$, $\lambda = 1.6334$.

ANS. $\lambda = 1.5678$,

$\varphi(x) = x - 0.6955x^2$

$(\lambda_+ = \sqrt{31} - 4,\ \lambda_- = -\sqrt{31} - 4)$.


16.3.15 The equation

$$f(x) = \int_a^b K(x,t)\varphi(t)\,dt$$

has a degenerate kernel $K(x,t) = \sum_{i=1}^{n}M_i(x)N_i(t)$.

(a) Show that this integral equation has no solution unless $f(x)$ can be written
as

$$f(x) = \sum_{i=1}^{n}f_iM_i(x),$$

with the $f_i$ constants.

(b) Show that to any solution $\varphi(x)$ we may add $\psi(x)$, provided $\psi(x)$ is
orthogonal to all $N_i(x)$:

$$\int_a^b N_i(x)\psi(x)\,dx = 0 \quad\text{for all } i.$$
Ja 


16.3.16 Using numerical quadrature, convert

$$\varphi(x) = \lambda\int_0^1 J_0(\alpha xt)\varphi(t)\,dt, \qquad J_0(\alpha) = 0,$$

to a set of simultaneous linear equations.

(a) Find the minimum eigenvalue $\lambda$.

(b) Determine $\varphi(x)$ at discrete values of $x$ and plot $\varphi(x)$ versus $x$. Compare
with the approximate eigenfunction of Exercise 16.3.13.

ANS. (a) $\lambda_{\min} = 1.14502$.


16.3.17 Using numerical quadrature, convert

$$\varphi(x) = \lambda\int_0^1 \sin\pi xt\;\varphi(t)\,dt$$

to a set of simultaneous linear equations.

(a) Find the minimum eigenvalue $\lambda$.

(b) Determine $\varphi(x)$ at discrete values of $x$ and plot $\varphi(x)$ versus $x$. Compare
with the approximate eigenfunction of Exercise 16.3.14.

ANS. (a) $\lambda_{\min} = 1.6334$.


16.3.18 Given a homogeneous Fredholm equation of the second kind

$$\lambda\varphi(x) = \int_0^1 K(x,t)\varphi(t)\,dt.$$

(a) Calculate the largest eigenvalue $\lambda_0$. Use the 10-point Gauss-Legendre
quadrature technique. For comparison the eigenvalues listed by Linz are
given as $\lambda_{\text{exact}}$.

(b) Tabulate $\varphi(x_k)$, where the $x_k$ are the 10 evaluation points in $[0, 1]$.

(c) Tabulate the ratio

$$\frac{1}{\lambda_0\varphi(x)}\int_0^1 K(x,t)\varphi(t)\,dt \quad\text{for } x = x_k.$$

This is the test of whether or not you really have a solution.

(a) $K(x,t) = e^{xt}$.

(b) $K(x,t) = \begin{cases}\tfrac{1}{2}x(2-t), & x < t,\\ \tfrac{1}{2}t(2-x), & x > t.\end{cases}$

(c) $K(x,t) = |x-t|$.

(d) $K(x,t) = \begin{cases}x, & x < t,\\ t, & x > t.\end{cases}$

ANS. $\lambda_{\text{exact}} = 0.40528$.

Note. (1) The evaluation points $x_i$ of Gauss-Legendre quadrature for $[-1, 1]$
may be linearly transformed into $[0, 1]$,

$$x_i[0,1] = \tfrac{1}{2}\left(x_i[-1,1] + 1\right).$$

Then the weighting factors $A_i$ are reduced in proportion to the length of the
interval:

$$A_i[0,1] = \tfrac{1}{2}A_i[-1,1].$$

16.3.19 Using the matrix variational technique of Exercise 17.8.7, refine your calculation
of the eigenvalue of Exercise 16.3.18(c) [$K(x,t) = |x-t|$]. Try a 40 x 40 matrix.
Note. Your matrix should be symmetric so that the (unknown) eigenvectors
will be orthogonal.

ANS. (40 point Gauss-Legendre quadrature) 0.34727.


16.4 HILBERT-SCHMIDT THEORY

Symmetrization of Kernels

This is the development of the properties of linear integral equations
(Fredholm type) with symmetric kernels,

$$K(x,t) = K(t,x). \tag{16.85}$$

Before plunging into the theory, we note that some important nonsymmetric
kernels can be symmetrized. If we have the equation

$$\varphi(x) = f(x) + \lambda\int_a^b K(x,t)\rho(t)\varphi(t)\,dt, \tag{16.86}$$

the total kernel is actually $K(x,t)\rho(t)$, clearly not symmetric if $K(x,t)$ alone is
symmetric. However, if we multiply Eq. 16.86 by $\sqrt{\rho(x)}$ and substitute

$$\sqrt{\rho(x)}\,\varphi(x) = \psi(x), \tag{16.87}$$

we obtain

$$\psi(x) = \sqrt{\rho(x)}\,f(x) + \lambda\int_a^b\left[K(x,t)\sqrt{\rho(x)\rho(t)}\right]\psi(t)\,dt, \tag{16.88}$$

with a symmetric total kernel, $K(x,t)\sqrt{\rho(x)\rho(t)}$. We shall meet $\rho(x)$ later as a
weighting factor in this integral equation Sturm-Liouville theory.


Orthogonal Eigenfunctions 

We now focus on the homogeneous Fredholm equation of the second kind:

$$\varphi(x) = \lambda\int_a^b K(x,t)\varphi(t)\,dt. \tag{16.89}$$

We assume that the kernel $K(x,t)$ is symmetric and real. Perhaps one of the
first questions the mathematician might ask about the equation is, "Does it
make sense?" or more precisely, "Does an eigenvalue $\lambda$ satisfying this equation
exist?" With the aid of the Schwarz and Bessel inequalities, Courant and Hilbert
(Chapter III, Section 4) show that if $K(x,t)$ is continuous, there is at least one
such eigenvalue and possibly an infinite number of them.

We show that the eigenvalues, $\lambda_i$, are real and that the corresponding
eigenfunctions, $\varphi_i(x)$, are orthogonal. Let $\lambda_i$, $\lambda_j$ be two different eigenvalues and $\varphi_i(x)$,
$\varphi_j(x)$, the corresponding eigenfunctions. Equation 16.89 then becomes

$$\varphi_i(x) = \lambda_i\int_a^b K(x,t)\varphi_i(t)\,dt, \tag{16.90a}$$

$$\varphi_j(x) = \lambda_j\int_a^b K(x,t)\varphi_j(t)\,dt. \tag{16.90b}$$

If we multiply Eq. 16.90a by $\lambda_j\varphi_j(x)$, Eq. 16.90b by $\lambda_i\varphi_i(x)$, and then integrate
with respect to $x$, the two equations become¹

$$\lambda_j\int_a^b \varphi_i(x)\varphi_j(x)\,dx = \lambda_i\lambda_j\int_a^b\!\!\int_a^b K(x,t)\varphi_i(t)\varphi_j(x)\,dt\,dx, \tag{16.91a}$$

$$\lambda_i\int_a^b \varphi_i(x)\varphi_j(x)\,dx = \lambda_i\lambda_j\int_a^b\!\!\int_a^b K(x,t)\varphi_j(t)\varphi_i(x)\,dt\,dx. \tag{16.91b}$$

1 We assume that the necessary integrals exist. For an example of a simple 
pathological case, see Exercise 16.4.3. 
Since we have demanded that $K(x,t)$ be symmetric, Eq. 16.91b may be rewritten
as

$$\lambda_i\int_a^b \varphi_i(x)\varphi_j(x)\,dx = \lambda_i\lambda_j\int_a^b\!\!\int_a^b K(x,t)\varphi_i(t)\varphi_j(x)\,dt\,dx. \tag{16.92}$$

Subtracting Eq. 16.92 from Eq. 16.91a, we obtain

$$(\lambda_j - \lambda_i)\int_a^b \varphi_i(x)\varphi_j(x)\,dx = 0. \tag{16.93}$$

This has the same form as Eq. 9.33 in the Sturm-Liouville theory. Since $\lambda_i \neq \lambda_j$,

$$\int_a^b \varphi_i(x)\varphi_j(x)\,dx = 0, \qquad i \neq j, \tag{16.94}$$


proving orthogonality. Note that with a real symmetric kernel no complex 
conjugates are involved in Eq. 16.94. For the self-adjoint or Hermitian kernel 
see Exercise 16.4.1. 
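
As a quick check of Eq. 16.94, the two eigenfunctions found in Example 16.3.2 for the real symmetric kernel $K(x,t) = t + x$ on $[-1, 1]$ are proportional to $1 + \sqrt{3}\,x$ and $1 - \sqrt{3}\,x$, with the distinct eigenvalues $\pm\sqrt{3}/2$. Their overlap integral vanishes:

```latex
\int_{-1}^{1}(1+\sqrt{3}\,x)(1-\sqrt{3}\,x)\,dx
  = \int_{-1}^{1}(1-3x^2)\,dx
  = 2 - 3\cdot\frac{2}{3} = 0 .
```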

If the eigenvalue is degenerate,² the eigenfunctions for that particular
eigenvalue may be orthogonalized by the Gram-Schmidt method (Section 9.3).
Our orthogonal eigenfunctions may, of course, be normalized, and we assume
that this has been done. The result is

$$\int_a^b \varphi_i(x)\varphi_j(x)\,dx = \delta_{ij}. \tag{16.95}$$

To demonstrate that the $\lambda_i$ are real, we need to get into complex conjugates.
Taking the complex conjugate of Eq. 16.90a, we have

$$\varphi_i^*(x) = \lambda_i^*\int_a^b K(x,t)\varphi_i^*(t)\,dt, \tag{16.96}$$

provided the kernel $K(x,t)$ is real. Now, using Eq. 16.96 instead of Eq. 16.90b,
we see that the analysis leads to

$$(\lambda_i^* - \lambda_i)\int_a^b \varphi_i^*(x)\varphi_i(x)\,dx = 0. \tag{16.97}$$

This time the integral cannot vanish (unless we have the trivial solution,
$\varphi_i(x) = 0$) and

$$\lambda_i^* = \lambda_i, \tag{16.98}$$

or $\lambda_i$, our eigenvalue, is real.

If readers feel that somehow this state of affairs is vaguely familiar, they are
right. This is the third time we have passed this way, first with Hermitian
matrices, then with Sturm-Liouville (self-adjoint) equations, and now with
Hilbert-Schmidt integral equations. The correspondence between the Hermitian
matrices and the self-adjoint differential equations shows up in modern
physics as the two outstanding formulations of quantum mechanics: the
Heisenberg matrix approach and the Schrödinger differential operator
approach. In Section 16.5 we shall explore further the correspondence between
the Hilbert-Schmidt symmetric kernel integral equations and the
Sturm-Liouville self-adjoint differential equations.


2 If more than one distinct eigenfunction corresponds to the same eigenvalue
(satisfying Eq. 16.89), that eigenvalue is said to be degenerate.

The eigenfunctions of our integral equation form a complete set³ in the sense
that any function $g(x)$ that can be generated by the integral

$$g(x) = \int_a^b K(x,t)h(t)\,dt, \tag{16.99}$$

in which $h(t)$ is any piecewise continuous function, can be represented by a
series of eigenfunctions,

$$g(x) = \sum_{n=1}^{\infty}a_n\varphi_n(x). \tag{16.100}$$

The series converges uniformly and absolutely.

Let us extend this to the kernel, $K(x,t)$, by asserting that

$$K(x,t) = \sum_{n=1}^{\infty}a_n\varphi_n(t), \tag{16.101}$$

and $a_n = a_n(x)$. Substituting into the original integral equation (Eq. 16.89) and
using the orthogonality integral, we obtain

$$\varphi_i(x) = \lambda_ia_i(x). \tag{16.102}$$

Therefore for our homogeneous Fredholm equation of the second kind the
kernel may be expressed in terms of the eigenfunctions and eigenvalues by

$$K(x,t) = \sum_{n=1}^{\infty}\frac{\varphi_n(x)\varphi_n(t)}{\lambda_n} \qquad\text{(zero not an eigenvalue).} \tag{16.103}$$

Here we have a bilinear expansion, a linear expansion in $\varphi_n(x)$ and linear in
$\varphi_n(t)$. Similar bilinear expansions appear in Section 8.7. It is possible that the
expansion given by Eq. 16.101 may not exist. As an illustration of the sort of
pathological behavior that may occur, the reader is invited to apply this analysis
to

$$\varphi(x) = \lambda\int_0^{\infty}e^{-xt}\varphi(t)\,dt$$

(compare Exercise 16.4.3). 
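
The bilinear expansion, Eq. 16.103, can be verified on a kernel that has only two eigenfunctions. As a sketch, take $K(x,t) = t + x$ on $[-1, 1]$ with the eigenpairs of Example 16.3.2, normalized here to unity: $\varphi_\pm(x) = \tfrac{1}{2}(1 \pm \sqrt{3}\,x)$, $\lambda_\pm = \pm\sqrt{3}/2$. The sum then reproduces the kernel exactly:

```latex
\sum_{\pm}\frac{\varphi_\pm(x)\,\varphi_\pm(t)}{\lambda_\pm}
  = \frac{2}{\sqrt{3}}\cdot
    \frac{(1+\sqrt{3}\,x)(1+\sqrt{3}\,t) - (1-\sqrt{3}\,x)(1-\sqrt{3}\,t)}{4}
  = \frac{2}{\sqrt{3}}\cdot\frac{2\sqrt{3}\,(x+t)}{4}
  = x + t .
```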

It should be emphasized that this Hilbert-Schmidt theory is concerned with
the establishment of properties of the eigenvalues (real) and eigenfunctions
(orthogonality, completeness), properties that may be of great interest and
value. The Hilbert-Schmidt theory does not solve the homogeneous integral
equation for us any more than the Sturm-Liouville theory of Chapter 9 solved
the differential equations. The solutions of the integral equation come from
Sections 16.2 and 16.3 (including numerical analysis).


3 For a proof of this statement see Courant and Hilbert, Chapter III, Section 5.

Nonhomogeneous Integral Equation 

We need a solution of the nonhomogeneous equation

$$\varphi(x) = f(x) + \lambda\int_a^b K(x,t)\varphi(t)\,dt. \tag{16.104}$$

Let us assume that the solutions of the corresponding homogeneous integral
equation are known,

$$\varphi_n(x) = \lambda_n\int_a^b K(x,t)\varphi_n(t)\,dt, \tag{16.105}$$

the solution $\varphi_n(x)$ corresponding to the eigenvalue $\lambda_n$. We expand both $\varphi(x)$
and $f(x)$ in terms of this set of eigenfunctions:

$$\varphi(x) = \sum_{n=1}^{\infty}a_n\varphi_n(x) \qquad (a_n \text{ unknown}), \tag{16.106}$$

$$f(x) = \sum_{n=1}^{\infty}b_n\varphi_n(x) \qquad (b_n \text{ known}). \tag{16.107}$$

Substituting into Eq. 16.104, we obtain

$$\sum_{n=1}^{\infty}a_n\varphi_n(x) = \sum_{n=1}^{\infty}b_n\varphi_n(x) + \lambda\int_a^b K(x,t)\sum_{n=1}^{\infty}a_n\varphi_n(t)\,dt. \tag{16.108}$$

By interchanging the order of integration and summation, we may evaluate the
integral by Eq. 16.105, and we get

$$\sum_{n=1}^{\infty}a_n\varphi_n(x) = \sum_{n=1}^{\infty}b_n\varphi_n(x) + \lambda\sum_{n=1}^{\infty}\frac{a_n\varphi_n(x)}{\lambda_n}. \tag{16.109}$$

If we multiply by $\varphi_i(x)$ and integrate from $x = a$ to $x = b$, the orthogonality
of our eigenfunctions leads to

$$a_i = b_i + \lambda\frac{a_i}{\lambda_i}. \tag{16.110}$$

This can be rewritten as

$$a_i = b_i + \frac{\lambda}{\lambda_i - \lambda}\,b_i, \tag{16.111}$$

which brings us to our solution

$$\varphi(x) = f(x) + \lambda\sum_{i=1}^{\infty}\frac{\int_a^b f(t)\varphi_i(t)\,dt}{\lambda_i - \lambda}\,\varphi_i(x). \tag{16.112}$$

Here it is assumed that the eigenfunctions, $\varphi_i(x)$, are normalized to unity.
Note that if $f(x) = 0$ there is no solution unless $\lambda = \lambda_i$. This means that our
homogeneous equation has no solution (except the trivial $\varphi(x) = 0$) unless $\lambda$
is an eigenvalue, $\lambda_i$.

In the event that $\lambda$ for the nonhomogeneous equation (16.104) is equal to one
of the eigenvalues, $\lambda_p$, of the homogeneous equation, our solution (Eq. 16.112)
blows up. To repair the damage we return to Eq. 16.110 and give the value

$$a_p = b_p + \lambda_p\frac{a_p}{\lambda_p} = b_p + a_p \tag{16.113}$$

special attention. Clearly, $a_p$ drops out and is no longer determined by $b_p$,
whereas $b_p = 0$. This implies that $\int f(x)\varphi_p(x)\,dx = 0$, that is, $f(x)$ is orthogonal
to the eigenfunction $\varphi_p(x)$. If this is not the case, we have no solution.

Equation 16.111 still holds for $i \neq p$, so we multiply by $\varphi_i(x)$ and sum over
$i$ ($i \neq p$) to obtain

$$\varphi(x) = f(x) + a_p\varphi_p(x) + \lambda_p\sum_{\substack{i=1\\ i\neq p}}^{\infty}{}^{\prime}\;\frac{\int_a^b f(t)\varphi_i(t)\,dt}{\lambda_i - \lambda_p}\,\varphi_i(x); \tag{16.114}$$

the prime emphasizes that the value i = p is omitted. In this solution the a p 
remains as an undetermined constant. 4 
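
The eigenfunction-expansion solution, Eq. 16.112, can be sketched numerically for a kernel with a finite set of eigenfunctions. The example below is an illustration under stated assumptions: it reuses the kernel $K(x,t) = t + x$ on $[-1, 1]$ with the eigenpairs of Example 16.3.2 (normalized to unity), restricts $f$ to the form $a + bx$ so that it lies in the span of the eigenfunctions, and picks $f(x) = 1$, $\lambda = \tfrac{1}{2}$ arbitrarily. The names `inner` and `hilbert_schmidt_solution` are hypothetical.

```python
import math

def inner(p, q):
    """Exact integral over [-1, 1] of (p0 + p1*x)(q0 + q1*x)."""
    return 2 * p[0] * q[0] + (2.0 / 3.0) * p[1] * q[1]

# Eigenpairs of K(x, t) = t + x on [-1, 1], normalized to unity.
# Each function a + b*x is stored as the pair (a, b).
r3 = math.sqrt(3.0)
eigenpairs = [(r3 / 2, (0.5, r3 / 2)), (-r3 / 2, (0.5, -r3 / 2))]

def hilbert_schmidt_solution(f, lam):
    """Coefficients (a, b) of phi(x) = a + b*x from Eq. 16.112."""
    a, b = f
    for lam_i, phi_i in eigenpairs:
        weight = lam * inner(f, phi_i) / (lam_i - lam)
        a += weight * phi_i[0]
        b += weight * phi_i[1]
    return a, b

# Illustrative choice (an assumption, not from the text): f(x) = 1, lambda = 1/2.
a, b = hilbert_schmidt_solution((1.0, 0.0), 0.5)
print(a, b)

# Substitute back: on [-1, 1], int t*phi dt = (2/3) b and int phi dt = 2 a,
# so f + lam * int (t + x) phi dt has constant term 1 + lam*(2/3)*b
# and slope lam*2*a.
residual = (a - (1.0 + 0.5 * (2.0 / 3.0) * b), b - 0.5 * 2 * a)
print(residual)
```

The result is $\varphi(x) = \tfrac{3}{2}(1 + x)$, and the substitution check confirms that it satisfies the original equation.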


EXERCISES 


16.4.1 In the Fredholm equation

$$\varphi(x) = \lambda\int_a^b K(x,t)\varphi(t)\,dt$$

the kernel $K(x,t)$ is self-adjoint or Hermitian:

$$K(x,t) = K^*(t,x).$$

Show that

(a) the eigenfunctions are orthogonal in the sense

$$\int_a^b \varphi_m^*(x)\varphi_n(x)\,dx = 0, \qquad m \neq n \ (\lambda_m \neq \lambda_n),$$

(b) the eigenvalues are real.


16.4.2 Solve the integral equation

$$\varphi(x) = x + \frac{1}{2}\int_{-1}^{1}(t+x)\varphi(t)\,dt$$

(compare Exercise 16.3.2) by the Hilbert-Schmidt method.

The application of the Hilbert-Schmidt technique here is somewhat like using
a shotgun to kill a mosquito, especially when the equation can be solved in about
15 seconds by expanding in Legendre polynomials.


4 This is like the inhomogeneous linear differential equation. We may add to 
its solution any constant times a solution of the corresponding homogeneous 
differential equation. 
16.4.3 Solve the Fredholm integral equation

$$\varphi(x) = \lambda\int_0^{\infty} e^{-xt}\varphi(t)\,dt.$$

Note. A series expansion of the kernel $e^{-xt}$ would permit a separable kernel-type
solution (Section 16.3), except that the series is infinite. This suggests an infinite
number of eigenvalues and eigenfunctions. If you stop with

$$\varphi(x) = x^{-1/2}, \qquad \lambda = \pi^{-1/2},$$

you will have missed most of the solutions! Show that the normalization integrals
of the eigenfunctions do not exist. A basic reason for this anomalous behavior
is that the range of integration is infinite, making this a “singular” integral
equation.


16.4.4 Given

$$y(x) = x + \lambda\int_0^1 xt\,y(t)\,dt.$$

(a) Determine $y(x)$ as a Neumann series.

(b) Find the range of $\lambda$ for which your Neumann series solution is convergent.
Compare with the value obtained from

$$|\lambda|\,|K|_{\max} < 1.$$

(c) Find the eigenvalue and the eigenfunction of the corresponding homogeneous
integral equation.

(d) By the separable kernel method show that the solution is

$$y(x) = \frac{3x}{3 - \lambda}.$$

(e) Find $y(x)$ by the Hilbert-Schmidt method.


16.4.5 In Exercise 16.3.4

$$K(x,t) = \cos(x - t).$$

The (unnormalized) eigenfunctions are $\cos x$ and $\sin x$.

(a) Show that there is a function $h(t)$ such that $K(x,s)$, considered as a function
of $s$ alone, may be written as

$$K(x,s) = \int_0^{2\pi} K(s,t)h(t)\,dt.$$

(b) Show that $K(x,t)$ may be expanded as

$$K(x,t) = \sum_{n=1}^{2}\frac{\varphi_n(x)\varphi_n(t)}{\lambda_n}.$$


16.4.6 The integral equation $\varphi(x) = \lambda\int_0^1 (1 + xt)\varphi(t)\,dt$ has eigenvalues $\lambda_1 = 0.7889$ and
$\lambda_2 = 15.211$ and eigenfunctions $\varphi_1 = 1 + 0.5352x$ and $\varphi_2 = 1 - 1.8685x$.

(a) Show that these eigenfunctions are orthogonal over the interval [0,1].

(b) Normalize the eigenfunctions to unity.

(c) Show that

$$K(x,t) = \frac{\varphi_1(x)\varphi_1(t)}{\lambda_1} + \frac{\varphi_2(x)\varphi_2(t)}{\lambda_2}.$$

ANS. (b) $\varphi_1(x) = 0.7831 + 0.4191x$,
$\varphi_2(x) = 1.8403 - 3.4386x$.

16.4.7 An alternate form of the solution to the nonhomogeneous integral equation,
Eq. 16.104, is

$$\varphi(x) = \sum_{i=1}^{\infty}\frac{b_i\lambda_i}{\lambda_i - \lambda}\,\varphi_i(x).$$

(a) Derive this form without using Eq. 16.112.

(b) Show that this form and Eq. 16.112 are equivalent.

16.4.8 (a) Show that the eigenfunctions of Exercise 16.3.5 are orthogonal.

(b) Show that the eigenfunctions of Exercise 16.3.11 are orthogonal.


16.5 GREEN'S FUNCTIONS— ONE DIMENSION 

As part of the investigation of differential operators in Section 8.7, we see that
Poisson’s equation of electrostatics

$$\nabla^2\varphi(\mathbf{r}) = -\frac{\rho(\mathbf{r})}{\varepsilon_0} \tag{16.115}$$

has a solution

$$\varphi(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\int\frac{\rho(\mathbf{r}_2)}{|\mathbf{r}_1 - \mathbf{r}_2|}\,d\tau_2. \tag{16.116}$$

Here we have the infinite case in which the range of integration covers all space.
If desired, the potential $\varphi(\mathbf{r}_1)$ may be developed for a finite case by using
appropriate charge and dipole layer distributions on the boundaries. 1
Equation 16.116 may be given two interpretations.

1. If the potential function $\varphi(\mathbf{r}_1)$ is known and we seek
the charge distribution $\rho(\mathbf{r}_2)$, which produces the
given potential, Eq. 16.116 is an integral equation for
$\rho(\mathbf{r}_2)$.

2. If the charge distribution $\rho(\mathbf{r}_2)$ is known, Eq. 16.116
yields the electrostatic potential $\varphi(\mathbf{r}_1)$ as a definite
integral.

Following up this second (and more frequently encountered) situation, we
may use the physicists’ customary cause and effect vocabulary. We might label
$\rho(\mathbf{r}_2)$ the “cause” that gives rise to the “effect” $\varphi(\mathbf{r}_1)$; that is, the charge
distribution produces a potential field. However, the effectiveness of the charge in
producing this potential depends on the distance between the element of charge
$\rho(\mathbf{r}_2)\,d\tau_2$ and the point of interest given by $\mathbf{r}_1$. This effectiveness or, let us say,
the influence of the element of charge is given by the function $(4\pi\varepsilon_0|\mathbf{r}_1 - \mathbf{r}_2|)^{-1}$.
For this reason $(4\pi\varepsilon_0|\mathbf{r}_1 - \mathbf{r}_2|)^{-1}$ is often called an influence function. Although
we relabel it a Green’s function, the physical basis for the term influence function
remains important and may well be helpful in determining the form of other
Green’s functions.

1 Compare J. A. Stratton, Electromagnetic Theory. New York: McGraw-Hill
(1941).

Also in Section 8.7, the Green’s function (for the operator $\nabla^2$) is described as
satisfying the point source equation

$$\nabla^2 G(\mathbf{r}_1,\mathbf{r}_2) = -\delta(\mathbf{r}_1 - \mathbf{r}_2). \tag{8.122}$$

A detailed discussion of the Dirac delta function in terms of sequences is
included. Using Eq. 8.122 and Green’s theorem, Section 1.11, the Green’s
function is shown to be symmetric:

$$G(\mathbf{r}_1,\mathbf{r}_2) = G(\mathbf{r}_2,\mathbf{r}_1). \tag{8.139}$$

In Section 9.4 the Dirac delta and Green’s functions are expanded in series 
of eigenfunctions. These expansions make the symmetry properties explicit. 

Moving into this chapter, in Section 16.1 it is seen that the integral equation 
corresponding to a differential equation and certain boundary conditions may 
lead to a peculiar kernel. This kernel is our Green’s function. 

The development of Green’s functions from Eq. 8.122 for two- and three- 
dimensional systems is the topic of Section 16.6. Here, for simplicity, we restrict 
ourselves to one-dimensional cases and follow a somewhat different approach. 2 

Defining Properties 

In our one-dimensional analysis we consider first the nonhomogeneous
Sturm-Liouville equation (Chapter 9)

$$\mathcal{L}y(x) + f(x) = 0, \tag{16.117}$$

in which $\mathcal{L}$ is the self-adjoint differential operator

$$\mathcal{L} = \frac{d}{dx}\left(p(x)\frac{d}{dx}\right) + q(x). \tag{16.118}$$

As in Section 9.1, $y(x)$ is required to satisfy certain boundary conditions at the
end points $a$ and $b$ of our interval $[a, b]$. Indeed, the interval may well be chosen
so that appropriate boundary conditions can be satisfied. We now proceed to
define a rather strange and arbitrary function $G$ over the interval $[a, b]$. At this
stage the most that can be said in defense of $G$ is that the defining properties
are legitimate, or mathematically acceptable. 3 Later, it is hoped, $G$ may appear
reasonable if not obvious.

2 Equation 8.122 can be used for one-dimensional systems. The relationship
between these two different approaches to Green’s functions is shown at the
end of this section.

3 Note, however, that these properties are just those of the kernel of the
Fredholm equation that had been derived from a self-adjoint differential
equation, Example 16.1.3.

1. The interval $a < x < b$ is divided by a parameter $t$.
We label $G(x) = G_1(x)$ for $a < x < t$ and $G(x) =
G_2(x)$ for $t < x < b$.

2. The functions $G_1(x)$ and $G_2(x)$ each satisfy the
homogeneous 4 Sturm-Liouville equation; that is,

$$\mathcal{L}G_1(x) = 0, \quad a < x < t, \qquad \mathcal{L}G_2(x) = 0, \quad t < x < b. \tag{16.119}$$

3. At $x = a$, $G_1(x)$ satisfies the boundary conditions we
impose on $y(x)$. At $x = b$, $G_2(x)$ satisfies the boundary
conditions imposed on $y(x)$ at this end point of the
interval. For convenience in renormalizing the
boundary conditions are taken to be homogeneous;
that is, at $x = a$

$$y(a) = 0, \quad\text{or}\quad y'(a) = 0, \quad\text{or}\quad \alpha y(a) + \beta y'(a) = 0,$$

and similarly for $x = b$.

4. We demand that $G(x)$ be continuous, 5

$$\lim_{x \to t^-} G_1(x) = \lim_{x \to t^+} G_2(x). \tag{16.120}$$

5. We require that $G'(x)$ be discontinuous, specifically
that 5

$$\left.\frac{d}{dx}G_2(x)\right|_{t} - \left.\frac{d}{dx}G_1(x)\right|_{t} = -\frac{1}{p(t)}, \tag{16.121}$$

where $p(t)$ comes from the self-adjoint operator, Eq.
16.118. Note that with the first derivative discontinuous
the second derivative does not exist.


These requirements, in effect, make $G$ a function of two variables, $G(x,t)$.
Also, we note that $G(x,t)$ depends on both the form of the differential operator
$\mathcal{L}$ and the boundary conditions that $y(x)$ must satisfy.

Now, assuming that we can find a function $G(x,t)$ that has these properties,
we label it a Green’s function and proceed to show that a solution of Eq. 16.117
is

$$y(x) = \int_a^b G(x,t)f(t)\,dt. \tag{16.122}$$

4 Homogeneous with respect to the unknown function. The function $f(x)$ in
Eq. 16.117 is set equal to zero.

5 Strictly speaking, this is the limit as $x \to t$.


To do this we first construct the Green’s function, $G(x,t)$. Let $u(x)$ be a solution
of the homogeneous Sturm-Liouville equation that satisfies the boundary
conditions at $x = a$ and $v(x)$ a solution that satisfies the boundary conditions
at $x = b$. Then we may take 6

$$G(x,t) = \begin{cases} c_1 u(x), & a < x < t, \\ c_2 v(x), & t < x < b. \end{cases} \tag{16.123}$$

Continuity at $x = t$ (Eq. 16.120) requires

$$c_2 v(t) - c_1 u(t) = 0. \tag{16.124}$$

Finally, the discontinuity in the first derivative (Eq. 16.121) becomes

$$c_2 v'(t) - c_1 u'(t) = -\frac{1}{p(t)}. \tag{16.125}$$


There will be a unique solution for our unknown coefficients $c_1$ and $c_2$ if the
Wronskian determinant

$$\begin{vmatrix} u(t) & v(t) \\ u'(t) & v'(t) \end{vmatrix} = u(t)v'(t) - v(t)u'(t)$$

does not vanish. We have seen in Section 8.6 that the nonvanishing of this
determinant is a necessary condition for linear independence. Let us consider
$u(x)$ and $v(x)$ to be independent. The contrary, which occurs when $u(x)$ satisfies
the boundary conditions at both end points, requires a generalized Green’s
function. Strictly speaking, no Green’s function exists when $u(x)$ and $v(x)$ are
linearly dependent. This is also true when $\lambda = 0$ is an eigenvalue of the
homogeneous equation. However, a “generalized Green’s function” may be defined.
This situation, which occurs with Legendre’s equation, is discussed in Courant
and Hilbert and other references. For independent $u(x)$ and $v(x)$ we have the
Wronskian (again from Section 8.6 or Exercise 9.1.4)

$$u(t)v'(t) - v(t)u'(t) = \frac{A}{p(t)}, \tag{16.126}$$

in which $A$ is a constant. Equation 16.126 is sometimes called Abel’s formula.
Numerous examples have appeared in connection with Bessel and Legendre
functions. Now, from Eq. 16.125, we identify

$$c_1 = -\frac{v(t)}{A}, \qquad c_2 = -\frac{u(t)}{A}. \tag{16.127}$$

6 The “constants” $c_1$ and $c_2$ are independent of $x$, but they may (and do)
depend on the other variable, $t$.


Equation 16.124 is clearly satisfied. Substitution into Eq. 16.123 yields our
Green’s function

$$G(x,t) = \begin{cases} -\dfrac{1}{A}\,u(x)v(t), & a < x < t, \\[2mm] -\dfrac{1}{A}\,u(t)v(x), & t < x < b. \end{cases} \tag{16.128}$$

Note carefully that $G(x,t) = G(t,x)$. This is the symmetry property that was
proved earlier in Section 8.7. Its physical interpretation is given by the reciprocity
principle (via our influence function): a cause at $t$ yields the same effect at
$x$ as a cause at $x$ produces at $t$. In terms of our electrostatic analogy this is
obvious, the influence function depending only on the magnitude of the distance
between the two points

$$|\mathbf{r}_1 - \mathbf{r}_2| = |\mathbf{r}_2 - \mathbf{r}_1|.$$
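
The construction of Eqs. 16.123 to 16.128 can be sketched in a few lines of Python. The operator $\mathcal{L} = d^2/dx^2 - 1$ on $[0,1]$ with $y(0) = y(1) = 0$ is an assumed example (so $p(x) = 1$, $u(x) = \sinh x$, $v(x) = \sinh(1 - x)$), and the name `make_green` is hypothetical.

```python
import math

def make_green(u, v, A):
    """Green's function of Eq. 16.128: -(1/A) * u(smaller argument) * v(larger argument)."""
    def G(x, t):
        lo, hi = (x, t) if x < t else (t, x)
        return -u(lo) * v(hi) / A
    return G

# Assumed example: L = d^2/dx^2 - 1 on [0, 1], y(0) = y(1) = 0, p(x) = 1.
u = math.sinh                               # satisfies the boundary condition at x = 0
v = lambda x: math.sinh(1.0 - x)            # satisfies the boundary condition at x = 1
du = math.cosh
dv = lambda x: -math.cosh(1.0 - x)

# Abel's formula, Eq. 16.126: A = p(t) * (u v' - v u'), independent of t.
A = u(0.3) * dv(0.3) - v(0.3) * du(0.3)     # analytically -sinh(1)
G = make_green(u, v, A)

t, h = 0.4, 1.0e-6
slope_right = (G(t + h, t) - G(t, t)) / h   # one-sided derivatives at x = t
slope_left = (G(t, t) - G(t - h, t)) / h
jump = slope_right - slope_left             # Eq. 16.121 predicts -1/p(t) = -1

print(G(0.2, 0.7), G(0.7, 0.2))             # symmetry G(x, t) = G(t, x)
print(jump)
```

The one-sided differences are only a rough numerical stand-in for the limits in Eq. 16.121, so the printed jump agrees with $-1$ to roughly the step size.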


Green's Function Integral — Differential Equation 

We have constructed $G(x,t)$, but there still remains the task of showing that
the integral (Eq. 16.122) with our new Green’s function is indeed a solution of
the original differential equation (16.117). This we do by direct substitution.
With $G(x,t)$ given by Eq. 16.128, 7 Eq. 16.122 becomes

$$y(x) = -\frac{1}{A}\int_a^x v(x)u(t)f(t)\,dt - \frac{1}{A}\int_x^b u(x)v(t)f(t)\,dt. \tag{16.129}$$

Differentiating, we obtain

$$y'(x) = -\frac{1}{A}\int_a^x v'(x)u(t)f(t)\,dt - \frac{1}{A}\int_x^b u'(x)v(t)f(t)\,dt, \tag{16.130}$$

the derivatives of the limits canceling. A second differentiation yields

$$y''(x) = -\frac{1}{A}\int_a^x v''(x)u(t)f(t)\,dt - \frac{1}{A}\int_x^b u''(x)v(t)f(t)\,dt - \frac{1}{A}\bigl[u(x)v'(x) - v(x)u'(x)\bigr]f(x). \tag{16.131}$$

By Eqs. 16.125 and 16.127 this may be rewritten as

$$y''(x) = -\frac{v''(x)}{A}\int_a^x u(t)f(t)\,dt - \frac{u''(x)}{A}\int_x^b v(t)f(t)\,dt - \frac{f(x)}{p(x)}. \tag{16.132}$$

Now, by substituting into Eq. 16.118, we have

$$\mathcal{L}y(x) = -\frac{[\mathcal{L}v(x)]}{A}\int_a^x u(t)f(t)\,dt - \frac{[\mathcal{L}u(x)]}{A}\int_x^b v(t)f(t)\,dt - f(x). \tag{16.133}$$

Since $u(x)$ and $v(x)$ were chosen to satisfy the homogeneous Sturm-Liouville
equation, the factors in brackets are zero and the integral terms vanish.
Transposing $f(x)$, we see that Eq. 16.117 is satisfied.

7 In the first integral $a < t < x$. Hence $G(x,t) = G_2(x,t) = -(1/A)u(t)v(x)$.
Similarly, the second integral requires $G = G_1$.

We must also check that $y(x)$ satisfies the required boundary conditions.
At point $x = a$

$$y(a) = -\frac{u(a)}{A}\int_a^b v(t)f(t)\,dt = c\,u(a), \tag{16.134}$$

$$y'(a) = -\frac{u'(a)}{A}\int_a^b v(t)f(t)\,dt = c\,u'(a), \tag{16.135}$$

since the definite integral is a constant. We chose $u(x)$ to satisfy

$$\alpha u(a) + \beta u'(a) = 0. \tag{16.136}$$

Multiplying by the constant $c$, we verify that $y(x)$ also satisfies Eq. 16.136. This
illustrates the utility of the homogeneous boundary conditions: The normalization
does not matter. In quantum mechanical problems the boundary condition
on the wave function is often expressed in terms of the ratio

$$\frac{\psi'(x)}{\psi(x)} = \frac{d}{dx}\ln\psi(x),$$

equivalent to Eq. 16.136. The advantage is that the wave function need not be
normalized.

Summarizing, we have Eq. 16.122

$$y(x) = \int_a^b G(x,t)f(t)\,dt,$$

which satisfies the differential equation (Eq. 16.117)

$$\mathcal{L}y(x) + f(x) = 0$$

and the boundary conditions, these boundary conditions having been built into
the Green’s function $G(x,t)$.
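
A numerical check of this summary, under assumed choices: $\mathcal{L} = d^2/dx^2 - 1$ on $[0,1]$ with $y(0) = y(1) = 0$ and $f(x) = 1$. For this operator Eq. 16.128 gives $G = \sinh(x_<)\sinh(1 - x_>)/\sinh 1$, and the boundary-value problem $y'' - y + 1 = 0$ has the closed form $y = 1 - \cosh(x - \tfrac12)/\cosh\tfrac12$, which the integral of Eq. 16.122 should reproduce. The Simpson-rule helper is part of the sketch, not of the text.

```python
import math

def G(x, t):
    """Green's function for L = d^2/dx^2 - 1 on [0,1], y(0) = y(1) = 0 (from Eq. 16.128)."""
    lo, hi = min(x, t), max(x, t)
    return math.sinh(lo) * math.sinh(1.0 - hi) / math.sinh(1.0)

def simpson(g, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3.0

def y_green(x, f):
    """Eq. 16.122, integrating separately on each side of the kink at t = x."""
    integrand = lambda t: G(x, t) * f(t)
    return simpson(integrand, 0.0, x) + simpson(integrand, x, 1.0)

f = lambda t: 1.0
exact = lambda x: 1.0 - math.cosh(x - 0.5) / math.cosh(0.5)

for x in (0.0, 0.25, 0.5, 0.9, 1.0):
    print(x, y_green(x, f), exact(x))
```

Splitting the quadrature at $t = x$ matters because $G(x,t)$ has a discontinuous first derivative there; a single Simpson sweep across the kink would converge much more slowly.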

Basically, what we have done is to use the solutions of the homogeneous 
Sturm-Liouville equation to construct a solution of the nonhomogeneous 
equation. Again, Poisson’s equation is an illustration. The solution (Eq. 16.116) 
represents a weighted [p(r 2 )] combination of solutions of the corresponding 
homogeneous Laplace’s equation. (We did this same sort of thing in Section 
16.4). 

It should be noted that our y(x), Eq. 16.122, is actually the particular solution 
of the differential equation, Eq. 16.117. Our boundary conditions exclude the 
addition of solutions of the homogeneous equation. In an actual physical 
problem we may well have both types of solutions. In electrostatics, for instance 
(compare Section 8.7), the Green’s function solution of Poisson's equation gives 
the potential created by the given charge distribution. In addition, there may 
be external fields superimposed. These would be described by solutions of the 
homogeneous equation, Laplace. 
Eigenfunction, Eigenvalue Equation

The preceding analysis placed no special restrictions on our $f(x)$. Let us
now assume that $f(x) = \lambda\rho(x)y(x)$. 8 Then we have

$$y(x) = \lambda\int_a^b G(x,t)\rho(t)y(t)\,dt \tag{16.137}$$

as a solution of

$$\mathcal{L}y(x) + \lambda\rho(x)y(x) = 0 \tag{16.138}$$

and its boundary conditions. Equation 16.137 is a homogeneous Fredholm
equation of the second kind and Eq. 16.138 is the Sturm-Liouville eigenvalue
equation of Chapter 9 [with the weighting function $w(x)$ replaced by $\rho(x)$].

Notice the change from Eqs. 16.117 and 16.122 to 16.137 and 16.138. There
is a corresponding change in the interpretation of our Green’s function. It
started as an importance or influence function, a weighting function giving the
importance of the charge $\rho(\mathbf{r}_2)$ in producing the potential $\varphi(\mathbf{r}_1)$. The charge $\rho$
was the nonhomogeneous term in the nonhomogeneous differential equation
16.117. Now the differential equation and the integral equation are both
homogeneous. $G(x,t)$ has become a link relating the two equations, differential
and integral.

To complete the discussion of this differential equation, integral equation
equivalence, let us now show that Eq. 16.138 implies Eq. 16.137; that is, a
solution of our differential equation (16.138) with its boundary conditions
satisfies the integral equation (16.137). We multiply Eq. 16.138 by $G(x,t)$, the
appropriate Green’s function, and integrate from $x = a$ to $x = b$ to obtain

$$\int_a^b G(x,t)\mathcal{L}y(x)\,dx + \lambda\int_a^b G(x,t)\rho(x)y(x)\,dx = 0. \tag{16.139}$$

The first integral is split in two ($x < t$, $x > t$), according to the construction of
our Green’s function, giving

$$-\int_a^t G_1(x,t)\mathcal{L}y(x)\,dx - \int_t^b G_2(x,t)\mathcal{L}y(x)\,dx = \lambda\int_a^b G(x,t)\rho(x)y(x)\,dx. \tag{16.140}$$

Note that $t$ is the upper limit for the $G_1$ integrals and the lower limit for the $G_2$
integrals. We are going to reduce the left-hand side of Eq. 16.140 to $y(t)$. Then,
with $G(x,t) = G(t,x)$, we have Eq. 16.137 (with $x$ and $t$ interchanged).

Applying Green’s theorem to the left-hand side or, equivalently, integrating
by parts, we obtain

$$-\int_a^t G_1(x,t)\mathcal{L}y(x)\,dx = -\int_a^t G_1(x,t)\left\{\frac{d}{dx}\bigl[p(x)y'(x)\bigr] + q(x)y(x)\right\}dx$$
$$= -\Bigl[G_1(x,t)p(x)y'(x)\Bigr]_a^t + \int_a^t G_1'(x,t)p(x)y'(x)\,dx - \int_a^t G_1(x,t)q(x)y(x)\,dx, \tag{16.141}$$

with an equivalent expression for the second integral. A second integration by
parts yields

$$-\int_a^t G_1(x,t)\mathcal{L}y(x)\,dx = -\int_a^t y(x)\mathcal{L}G_1(x,t)\,dx - \Bigl[G_1(x,t)p(x)y'(x)\Bigr]_a^t + \Bigl[G_1'(x,t)p(x)y(x)\Bigr]_a^t. \tag{16.142}$$

The integral on the right vanishes because $\mathcal{L}G_1 = 0$. By combining the
integrated terms with those from integrating $G_2$, we have

$$-p(t)\bigl[G_1(t,t)y'(t) - G_1'(t,t)y(t) - G_2(t,t)y'(t) + G_2'(t,t)y(t)\bigr]$$
$$+\, p(a)\bigl[G_1(a,t)y'(a) - G_1'(a,t)y(a)\bigr] - p(b)\bigl[G_2(b,t)y'(b) - G_2'(b,t)y(b)\bigr]. \tag{16.143}$$

Each of the last two expressions vanishes, for $G(x,t)$ and $y(x)$ satisfy the same
boundary conditions. The first expression, with the help of Eqs. 16.120 and
16.121, reduces to $y(t)$. Substituting into Eq. 16.140, we have Eq. 16.137, thus
completing the demonstration of the equivalence of the integral equation and
the differential equation plus boundary conditions.

8 The function $\rho(x)$ is a weighting function, not a charge density.

EXAMPLE 16.5.1. Linear Oscillator 


As a simple example, consider the linear oscillator equation (for a vibrating
string)

$$y''(x) + \lambda y(x) = 0. \tag{16.144}$$

We impose the conditions $y(0) = y(1) = 0$, which correspond to a string clamped
at both ends. Now, to construct our Green’s function, we need solutions of the
homogeneous Sturm-Liouville equation, $\mathcal{L}y(x) = 0$, which is $y''(x) = 0$. To
satisfy the boundary conditions, we must have one solution vanish at $x = 0$,
the other at $x = 1$. Such solutions (unnormalized) are

$$u(x) = x, \qquad v(x) = 1 - x. \tag{16.145}$$

We find that

$$uv' - vu' = -1 \tag{16.146}$$

or, by Eq. 16.126 with $p(x) = 1$, $A = -1$. Our Green’s function becomes

$$G(x,t) = \begin{cases} x(1 - t), & 0 < x < t, \\ t(1 - x), & t < x < 1. \end{cases} \tag{16.147}$$

Hence by Eq. 16.137 our clamped vibrating string satisfies

$$y(x) = \lambda\int_0^1 G(x,t)y(t)\,dt. \tag{16.148}$$

This is Eq. 16.34 with $b = 1$ and $\omega^2 = \lambda$.
FIG. 16.3 A linear oscillator Green’s function (marked abscissas: $x = t$, $x = 1$)

The reader may show that the known solutions of Eq. 16.144,

$$y = \sin n\pi x, \qquad \lambda = n^2\pi^2,$$

do indeed satisfy Eq. 16.148. Note that our eigenvalue $\lambda$ is not the wavelength.

Green's Function and the Dirac Delta Function

One more approach to the Green’s function may shed additional light on
our formulation and particularly on its relation to physical problems. Let us
refer once more to Poisson’s equation, this time for a point charge

$$\nabla^2\varphi(\mathbf{r}) = -\frac{\rho_{\text{point}}}{\varepsilon_0}. \tag{16.149}$$

The Green’s function solution of this equation was developed in Section 8.7.
This time let us take a one-dimensional analog

$$\mathcal{L}y(x) + f(x)_{\text{point}} = 0. \tag{16.150}$$

Here $f(x)_{\text{point}}$ refers to a unit point “charge” or a point force. We may represent it
by a number of forms, but perhaps the most convenient is

$$f(x)_{\text{point}} = \begin{cases} \dfrac{1}{2\varepsilon}, & t - \varepsilon < x < t + \varepsilon, \\[2mm] 0, & \text{elsewhere}, \end{cases} \tag{16.151}$$

which is essentially the same as Eq. 8.108. Then, integrating Eq. 16.150, we have

$$\int_{t-\varepsilon}^{t+\varepsilon}\mathcal{L}y(x)\,dx = -\int_{t-\varepsilon}^{t+\varepsilon} f(x)_{\text{point}}\,dx = -1 \tag{16.152}$$

from the definition of $f(x)_{\text{point}}$. Let us examine $\mathcal{L}y(x)$ more closely. We have

$$\int_{t-\varepsilon}^{t+\varepsilon}\frac{d}{dx}\bigl[p(x)y'(x)\bigr]dx + \int_{t-\varepsilon}^{t+\varepsilon} q(x)y(x)\,dx = \Bigl[p(x)y'(x)\Bigr]_{t-\varepsilon}^{t+\varepsilon} + \int_{t-\varepsilon}^{t+\varepsilon} q(x)y(x)\,dx = -1. \tag{16.153}$$

In the limit $\varepsilon \to 0$ we may satisfy this relation by permitting $y'(x)$ to have a
discontinuity of $-1/p(x)$ at $x = t$, $y(x)$ itself remaining continuous. 9 These,
however, are just the properties used to define our Green’s function, $G(x,t)$.
In addition, we note that in the limit $\varepsilon \to 0$

$$f(x)_{\text{point}} = \delta(x - t), \tag{16.154}$$

in which $\delta(x - t)$ is our Dirac delta function, defined in this manner in Section
8.7. Hence Eq. 16.150 has become

$$\mathcal{L}G(x,t) = -\delta(x - t). \tag{16.155}$$

This is Eq. 8.132, which we exploit for the development of Green’s functions
in two and three dimensions (Section 16.6). It will be recalled that we used
this relation in Section 8.7 to determine our Green’s functions.

9 The functions $p(x)$ and $q(x)$ appearing in the operator $\mathcal{L}$ are continuous
functions. With $y(x)$ remaining continuous, $\int q(x)y(x)\,dx$ is certainly continuous.
Hence this integral over an interval $2\varepsilon$ (Eq. 16.153) vanishes as $\varepsilon$ vanishes.
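
The limiting process can be watched numerically. The sketch below, which assumes the clamped-string operator $\mathcal{L} = d^2/dx^2$ on $[0,1]$ of Example 16.5.1, feeds the rectangular pulse of Eq. 16.151 through the integral solution of Eq. 16.122 and compares the response with $G(x,t)$. The function name `pulse_response` is hypothetical.

```python
def G(x, t):
    """Eq. 16.147: Green's function for L = d^2/dx^2, y(0) = y(1) = 0."""
    return x * (1.0 - t) if x < t else t * (1.0 - x)

def pulse_response(x, t, eps):
    """y(x) = int G(x, s) f_point(s) ds with f_point = 1/(2 eps) on (t - eps, t + eps)."""
    def piece(a, b):
        # G(x, s) is linear in s on any interval that does not contain s = x,
        # so the midpoint rule is exact there.
        return G(x, 0.5 * (a + b)) * (b - a)
    lo, hi = t - eps, t + eps
    if lo < x < hi:
        area = piece(lo, x) + piece(x, hi)
    else:
        area = piece(lo, hi)
    return area / (2.0 * eps)

t = 0.4
for eps in (0.1, 0.01, 0.001):
    # At the source point the response falls short of G(t, t) by eps/4.
    print(eps, pulse_response(t, t, eps), G(t, t))
print(pulse_response(0.8, t, 0.1), G(0.8, t))   # outside the pulse: equal for any eps
```

For this operator the response outside the pulse equals $G(x,t)$ exactly, while at $x = t$ it differs by $\varepsilon/4$, which vanishes in the limit as the text describes.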

Equation 16.155 could have been expected since it is actually a consequence
of our differential equation, Eq. 16.117, and Green’s function integral solution,
Eq. 16.122. If we let $\mathcal{L}_x$ (subscript to emphasize that it operates on the
$x$-dependence) operate on both sides of Eq. 16.122, then

$$\mathcal{L}_x y(x) = \mathcal{L}_x\int_a^b G(x,t)f(t)\,dt.$$

By Eq. 16.117 the left-hand side is just $-f(x)$. On the right $\mathcal{L}_x$ is independent
of the variable of integration $t$, so we may write

$$-f(x) = \int_a^b\bigl\{\mathcal{L}_x G(x,t)\bigr\}f(t)\,dt.$$

By definition of Dirac delta function, Eqs. 8.107 and 8.117, we have Eq. 16.155.


EXERCISES 


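16.5.1 Show that

$$G(x,t) = \begin{cases} x, & 0 < x < t, \\ t, & t < x < 1, \end{cases}$$

is the Green’s function for the operator $\mathcal{L} = d^2/dx^2$ and the boundary conditions

$$y(0) = 0, \qquad y'(1) = 0.$$


16.5.2 Find the Green’s function for

(a) $\mathcal{L}y(x) = \dfrac{d^2y(x)}{dx^2} + y(x)$, with $y(0) = 0$, $y'(1) = 0$,

(b) $\mathcal{L}y(x) = \dfrac{d^2y(x)}{dx^2} - y(x)$, with $y(x)$ finite for $-\infty < x < \infty$.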


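16.5.3 Find the Green’s function for the operators

(a) $\mathcal{L}y(x) = \dfrac{d}{dx}\left(x\dfrac{dy(x)}{dx}\right)$.

ANS. (a) $G(x,t) = \begin{cases} -\ln t, & 0 < x < t, \\ -\ln x, & t < x < 1, \end{cases}$

(b) $\mathcal{L}y(x) = \dfrac{d}{dx}\left(x\dfrac{dy(x)}{dx}\right) - \dfrac{n^2}{x}\,y(x)$,

with $y(0)$ finite and $y(1) = 0$.

ANS. (b) $G(x,t) = \begin{cases} \dfrac{1}{2n}\left[\left(\dfrac{x}{t}\right)^n - (xt)^n\right], & 0 < x < t, \\[3mm] \dfrac{1}{2n}\left[\left(\dfrac{t}{x}\right)^n - (xt)^n\right], & t < x < 1. \end{cases}$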

The combination of operator and interval specified in Exercise 16.5.3(a) is 
pathological in that one of the end points of the interval (zero) is a singular 
point of the operator. As a consequence, the integrated part (the surface integral 
of Green’s theorem) does not vanish. The next four exercises explore this situa- 
tion. 


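16.5.4 (a) Show that the particular solution of

$$\frac{d}{dx}\left[x\frac{dy(x)}{dx}\right] = -1$$

is $y_P(x) = -x$.

(b) Show that

$$y_P(x) = -x \neq \int_0^1 G(x,t)(-1)\,dt,$$

where $G(x,t)$ is the Green’s function of Exercise 16.5.3(a).


16.5.5 Show that Green’s theorem, Eq. 1.97 in one dimension with a Sturm-Liouville
type operator $\dfrac{d}{dt}p(t)\dfrac{d}{dt}$ replacing $\nabla\cdot\nabla$, may be rewritten as

$$\int_a^b\left[u(t)\frac{d}{dt}\left(p(t)\frac{dv(t)}{dt}\right) - v(t)\frac{d}{dt}\left(p(t)\frac{du(t)}{dt}\right)\right]dt = \left[u(t)p(t)\frac{dv(t)}{dt} - v(t)p(t)\frac{du(t)}{dt}\right]_a^b.$$


16.5.6 Using the one-dimensional form of Green’s theorem of Exercise 16.5.5, let

$$v(t) = y(t) \quad\text{and}\quad \frac{d}{dt}\left(p(t)\frac{dy(t)}{dt}\right) = -f(t),$$

$$u(t) = G(x,t) \quad\text{and}\quad \frac{d}{dt}\left(p(t)\frac{\partial G(x,t)}{\partial t}\right) = -\delta(x - t).$$

Show that Green’s theorem yields

$$y(x) = \int_a^b G(x,t)f(t)\,dt + \left[G(x,t)p(t)\frac{dy(t)}{dt} - y(t)p(t)\frac{\partial}{\partial t}G(x,t)\right]_{t=a}^{t=b}.$$


16.5.7 For $p(t) = t$, $y(t) = -t$,

$$G(x,t) = \begin{cases} -\ln t, & 0 < x < t, \\ -\ln x, & t < x < 1, \end{cases}$$

verify that the integrated part does not vanish.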
16.5.8 Construct the Green’s function for

$$x^2\frac{d^2y}{dx^2} + x\frac{dy}{dx} + (k^2x^2 - 1)y = 0,$$

subject to the boundary conditions

$$y(0) = 0, \qquad y(1) = 0.$$


16.5.9 Given

$$\mathcal{L} = (1 - x^2)\frac{d^2}{dx^2} - 2x\frac{d}{dx}$$

and $G(\pm 1, t)$ remains finite. Show that no Green’s function can be constructed
by the techniques of this section. ($u(x)$ and $v(x)$ are linearly dependent.)


16.5.10 Construct the infinite one-dimensional Green’s function for the Helmholtz
equation

$$(\nabla^2 + k^2)\psi(x) = g(x).$$

The boundary conditions are those for a wave advancing in the positive $x$
direction, assuming a time dependence $e^{-i\omega t}$.

ANS. $G(x_1,x_2) = \dfrac{i}{2k}\exp(ik|x_1 - x_2|)$.


16.5.11 Construct the infinite one-dimensional Green’s function for the modified
Helmholtz equation

$$(\nabla^2 - k^2)\psi(x) = f(x).$$

The boundary conditions are that the Green’s function must vanish for $x \to \infty$
and $x \to -\infty$.

ANS. $G(x_1,x_2) = \dfrac{1}{2k}\exp(-k|x_1 - x_2|)$.


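16.5.12 From the eigenfunction expansion of the Green’s function show that

(a) $\dfrac{2}{\pi^2}\displaystyle\sum_{n=1}^{\infty}\frac{\sin n\pi x\,\sin n\pi t}{n^2} = \begin{cases} x(1 - t), & 0 < x < t, \\ t(1 - x), & t < x < 1. \end{cases}$

(b) $\dfrac{2}{\pi^2}\displaystyle\sum_{n=0}^{\infty}\frac{\sin(n + \tfrac12)\pi x\,\sin(n + \tfrac12)\pi t}{(n + \tfrac12)^2} = \begin{cases} x, & 0 < x < t, \\ t, & t < x < 1. \end{cases}$

Note. In Section 9.4 the Green’s function of $\mathcal{L} + \lambda$ is expanded in eigenfunctions.
The $\lambda$ there is an adjustable parameter, not an eigenvalue.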


16.5.13 In the Fredholm equation,

$$f(x) = \lambda^2\int_a^b G(x,t)\varphi(t)\,dt,$$

$G(x,t)$ is a Green’s function given by

$$G(x,t) = \sum_{n=1}^{\infty}\frac{\varphi_n(x)\varphi_n(t)}{\lambda_n^2 - \lambda^2}.$$

Show that the solution is

$$\varphi(x) = \sum_{n=1}^{\infty}\frac{\lambda_n^2 - \lambda^2}{\lambda^2}\,\varphi_n(x)\int_a^b f(t)\varphi_n(t)\,dt.$$


16.5.14 Show that the Green’s function integral transform operator

$$\int_a^b G(x,t)[\ \ ]\,dt$$

is equal to $-\mathcal{L}^{-1}$ in the sense that

(a) $\mathcal{L}_x\displaystyle\int_a^b G(x,t)y(t)\,dt = -y(x)$,

(b) $\displaystyle\int_a^b G(x,t)\mathcal{L}_t y(t)\,dt = -y(x)$.

Note. Take $\mathcal{L}y(x) + f(x) = 0$, Eq. 16.117.


16.6 GREEN'S FUNCTIONS— TWO AND THREE 
DIMENSIONS 

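As in the preceding section (and in Section 8.7), we consider a nonhomogeneous
differential equation

$$\mathcal{L}y(\mathbf{r}_1) = -f(\mathbf{r}_1). \tag{16.156}$$

We seek a solution that might be represented by

$$y(\mathbf{r}_1) = -\mathcal{L}^{-1}f(\mathbf{r}_1). \tag{16.156a}$$

It might be expected that with $\mathcal{L}$ a differential operator, the inverse operator $\mathcal{L}^{-1}$
will involve integration. To proceed further, we define the Green’s function
corresponding to the differential operator $\mathcal{L}$ as a solution of the point source
nonhomogeneous equation 1

$$\mathcal{L}_1 G(\mathbf{r}_1,\mathbf{r}_2) = -\delta(\mathbf{r}_1 - \mathbf{r}_2), \tag{16.156b}$$

which satisfies the required boundary conditions. Here the subscript 1 on $\mathcal{L}$
emphasizes that $\mathcal{L}$ operates on $\mathbf{r}_1$.

Let us assume that $\mathcal{L}_1$ is a self-adjoint differential operator of the general
form 2

$$\mathcal{L}_1 = \nabla_1\cdot\bigl[p(\mathbf{r}_1)\nabla_1\bigr] + q(\mathbf{r}_1). \tag{16.156c}$$

Then, as a simple generalization of Green’s theorem, Eq. 1.97, we have

$$\int\bigl(v\mathcal{L}_2 u - u\mathcal{L}_2 v\bigr)\,d\tau_2 = \int p\,\bigl(v\nabla_2 u - u\nabla_2 v\bigr)\cdot d\boldsymbol{\sigma}_2, \tag{16.156d}$$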

1 This equation appears in different forms in different references. Some authors
write the right-hand side as $-4\pi\delta(\mathbf{r}_1 - \mathbf{r}_2)$, others use $+\delta(\mathbf{r}_1 - \mathbf{r}_2)$. As stressed
in Section 8.7, the delta function will be part of an integrand.

2 $\mathcal{L}_1$ may be in 1, 2, or 3 dimensions (with appropriate interpretation of $\nabla_1$).
in which all quantities have $\mathbf{r}_2$ as their argument. (To verify Eq. 16.156d, take
the divergence of the integrand of the surface integral.) We let $u(\mathbf{r}_2) = y(\mathbf{r}_2)$
so that Eq. 16.156 applies and $v(\mathbf{r}_2) = G(\mathbf{r}_1,\mathbf{r}_2)$ so that Eq. 16.156b applies.
(Remember $G(\mathbf{r}_1,\mathbf{r}_2) = G(\mathbf{r}_2,\mathbf{r}_1)$, Section 8.7.) Substituting into Green’s theorem,

$$\int\bigl\{-G(\mathbf{r}_1,\mathbf{r}_2)f(\mathbf{r}_2) + y(\mathbf{r}_2)\delta(\mathbf{r}_1 - \mathbf{r}_2)\bigr\}\,d\tau_2 = \int p(\mathbf{r}_2)\bigl\{G(\mathbf{r}_1,\mathbf{r}_2)\nabla_2 y(\mathbf{r}_2) - y(\mathbf{r}_2)\nabla_2 G(\mathbf{r}_1,\mathbf{r}_2)\bigr\}\cdot d\boldsymbol{\sigma}_2. \tag{16.156e}$$

Integrating over the Dirac delta function,

$$y(\mathbf{r}_1) = \int G(\mathbf{r}_1,\mathbf{r}_2)f(\mathbf{r}_2)\,d\tau_2 + \int p(\mathbf{r}_2)\bigl\{G(\mathbf{r}_1,\mathbf{r}_2)\nabla_2 y(\mathbf{r}_2) - y(\mathbf{r}_2)\nabla_2 G(\mathbf{r}_1,\mathbf{r}_2)\bigr\}\cdot d\boldsymbol{\sigma}_2. \tag{16.156f}$$

Our solution to Eq. 16.156 appears as a volume integral plus a surface integral.
If $y$ and $G$ both satisfy Dirichlet boundary conditions, or if both satisfy Neumann
boundary conditions, the surface integral vanishes and we regain Eq. 16.122.
The volume integral is a weighted integral over the source term $f(\mathbf{r}_2)$ with our
Green’s function $G(\mathbf{r}_1,\mathbf{r}_2)$ as the weighting function.


Form of Green's Functions 

For the special case of $p(\mathbf{r}_1) = 1$ and $q(\mathbf{r}_1) = 0$, $\mathcal{L}$ is $\nabla^2$, the Laplacian.
Let us integrate

$$\nabla_1^2 G(\mathbf{r}_1,\mathbf{r}_2) = -\delta(\mathbf{r}_1 - \mathbf{r}_2) \tag{16.157}$$

over a small volume including the point source. Then

$$\int\nabla_1\cdot\nabla_1 G(\mathbf{r}_1,\mathbf{r}_2)\,d\tau_1 = -\int\delta(\mathbf{r}_1 - \mathbf{r}_2)\,d\tau_1 = -1. \tag{16.157a}$$

The volume integral on the left may be transformed by Gauss’s theorem as in
the development of Gauss’s law (Section 1.14). We find that

$$\int\nabla_1 G(\mathbf{r}_1,\mathbf{r}_2)\cdot d\boldsymbol{\sigma}_1 = -1. \tag{16.158}$$

This shows, incidentally, that it may not be possible to impose a Neumann
boundary condition, that the normal derivative of the Green’s function, $\partial G/\partial n$,
vanishes over the entire surface.

If we are in three-dimensional space, Eq. 16.158 is satisfied by taking

$$\frac{\partial}{\partial r_{12}}G(\mathbf{r}_1,\mathbf{r}_2) = -\frac{1}{4\pi}\,\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|^2}, \qquad r_{12} = |\mathbf{r}_1 - \mathbf{r}_2|. \tag{16.158a}$$

The integration is over the surface of a sphere centered at $\mathbf{r}_2$. The integral
of Eq. 16.158a is

$$G(\mathbf{r}_1,\mathbf{r}_2) = \frac{1}{4\pi}\,\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|}, \tag{16.159}$$

in agreement with Section 1.14.

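If we are in two-dimensional space, Eq. 16.158 is satisfied by taking

$$\frac{\partial}{\partial\rho_{12}}G(\boldsymbol{\rho}_1,\boldsymbol{\rho}_2) = -\frac{1}{2\pi}\,\frac{1}{|\boldsymbol{\rho}_1 - \boldsymbol{\rho}_2|}, \tag{16.160}$$

with $r$ being replaced by $\rho$, $\rho = (x^2 + y^2)^{1/2}$, and the integration being over the
circumference of a circle centered on $\boldsymbol{\rho}_2$. Here $\rho_{12} = |\boldsymbol{\rho}_1 - \boldsymbol{\rho}_2|$. Integrating Eq.
16.160, we obtain

$$G(\boldsymbol{\rho}_1,\boldsymbol{\rho}_2) = -\frac{1}{2\pi}\ln|\boldsymbol{\rho}_1 - \boldsymbol{\rho}_2|. \tag{16.161}$$

To $G(\boldsymbol{\rho}_1,\boldsymbol{\rho}_2)$ (and to $G(\mathbf{r}_1,\mathbf{r}_2)$) we may add any multiple of the regular
solution of the homogeneous equation as needed to satisfy boundary conditions.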
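
The normalization condition of Eq. 16.158 is easy to confirm numerically for both forms. The sketch below differentiates the Laplace Green’s functions of Eqs. 16.159 and 16.161 radially by a central difference and multiplies by the area of a sphere or the circumference of a circle; the function names and the step size are choices made for this illustration.

```python
import math

def G3(r):
    """Eq. 16.159 as a function of r = |r1 - r2|."""
    return 1.0 / (4.0 * math.pi * r)

def G2(rho):
    """Eq. 16.161 as a function of rho = |rho1 - rho2|."""
    return -math.log(rho) / (2.0 * math.pi)

def radial_derivative(g, r, h=1.0e-5):
    """Central-difference estimate of dg/dr."""
    return (g(r + h) - g(r - h)) / (2.0 * h)

def flux_3d(R):
    """Surface integral of grad G over a sphere of radius R about the source."""
    return radial_derivative(G3, R) * 4.0 * math.pi * R * R

def flux_2d(R):
    """Integral of the normal derivative of G around a circle of radius R."""
    return radial_derivative(G2, R) * 2.0 * math.pi * R

for R in (0.5, 1.0, 3.0):
    print(R, flux_3d(R), flux_2d(R))   # each should be close to -1, whatever R is
```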

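The behavior of the Laplace operator Green’s function in the vicinity of the
source point $\mathbf{r}_1 = \mathbf{r}_2$ shown by Eqs. 16.159 and 16.161 facilitates the identification
of the Green’s functions for the other cases, such as the Helmholtz and modified
Helmholtz equations.

1. For $\mathbf{r}_1 \neq \mathbf{r}_2$, $G(\mathbf{r}_1,\mathbf{r}_2)$ must satisfy the homogeneous
differential equation

$$\mathcal{L}_1 G(\mathbf{r}_1,\mathbf{r}_2) = 0, \qquad \mathbf{r}_1 \neq \mathbf{r}_2. \tag{16.162}$$

2. As $\mathbf{r}_1 \to \mathbf{r}_2$ (or $\boldsymbol{\rho}_1 \to \boldsymbol{\rho}_2$),

$$G(\boldsymbol{\rho}_1,\boldsymbol{\rho}_2) \approx -\frac{1}{2\pi}\ln|\boldsymbol{\rho}_1 - \boldsymbol{\rho}_2|, \qquad\text{two-dimensional space,} \tag{16.163}$$

$$G(\mathbf{r}_1,\mathbf{r}_2) \approx \frac{1}{4\pi}\,\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|}, \qquad\text{three-dimensional space.} \tag{16.163a}$$

The term $\pm k^2$ in the operator does not affect the behavior of $G$ near the singular
point $\mathbf{r}_1 = \mathbf{r}_2$. For convenience, the Green’s functions for the Laplace, Helmholtz,
and modified Helmholtz operators are listed in Table 16.1.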


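TABLE 16.1 Green’s Functions$^a$

                         Laplace                     Helmholtz                          Modified Helmholtz
                         $\nabla^2$                  $\nabla^2 + k^2$                   $\nabla^2 - k^2$

One-dimensional space    No solution for             $\frac{i}{2k}\exp(ik|x_1 - x_2|)$      $\frac{1}{2k}\exp(-k|x_1 - x_2|)$
                         $(-\infty, \infty)$

Two-dimensional space    $-\frac{1}{2\pi}\ln|\boldsymbol{\rho}_1 - \boldsymbol{\rho}_2|$      $\frac{i}{4}H_0^{(1)}(k|\boldsymbol{\rho}_1 - \boldsymbol{\rho}_2|)$      $\frac{1}{2\pi}K_0(k|\boldsymbol{\rho}_1 - \boldsymbol{\rho}_2|)$

Three-dimensional space  $\frac{1}{4\pi}\frac{1}{|\mathbf{r}_1 - \mathbf{r}_2|}$      $\frac{\exp(ik|\mathbf{r}_1 - \mathbf{r}_2|)}{4\pi|\mathbf{r}_1 - \mathbf{r}_2|}$      $\frac{\exp(-k|\mathbf{r}_1 - \mathbf{r}_2|)}{4\pi|\mathbf{r}_1 - \mathbf{r}_2|}$

$^a$ These are the Green’s functions satisfying the boundary condition $G(\mathbf{r}_1,\mathbf{r}_2) = 0$ as $r_1 \to \infty$
for the Laplace and modified Helmholtz operators. For the Helmholtz operator, $G(\mathbf{r}_1,\mathbf{r}_2)$
corresponds to an outgoing wave. $H_0^{(1)}$ is the Hankel function of Section 11.4. $K_0$ is the
modified Bessel function of Section 11.5.


Spherical Polar Coordinate Expansion

As an alternate determination of the Green’s function of the Laplace operator,
let us assume a spherical harmonic expansion of the form

$$G(\mathbf{r}_1,\mathbf{r}_2) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} g_l(r_1,r_2)\,Y_l^m(\theta_1,\varphi_1)\,Y_l^{m*}(\theta_2,\varphi_2). \tag{16.164}$$

We will determine $g_l(r_1,r_2)$. From Exercises 8.6.7 and 12.6.6

$$\delta(\mathbf{r}_1 - \mathbf{r}_2) = \frac{1}{r_1^2}\,\delta(r_1 - r_2)\,\delta(\cos\theta_1 - \cos\theta_2)\,\delta(\varphi_1 - \varphi_2)
= \frac{1}{r_1^2}\,\delta(r_1 - r_2)\sum_{l=0}^{\infty}\sum_{m=-l}^{l} Y_l^m(\theta_1,\varphi_1)\,Y_l^{m*}(\theta_2,\varphi_2). \tag{16.165}$$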


Substituting Eqs. 16.164 and 16.165 into the Green’s function differential equa- 
tion, Eq. 16.157, and making use of the orthogonality of the spherical harmonics, 
we obtain a radial equation: 

dr\ ] 


r iyrl/i0!( r i’ r 2)] - K 1 + l)9i(ri,r 2 ) = -<K r, - r 2 ). 


(16.166) 


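This is now a one-dimensional problem. The solutions³ of the corresponding
homogeneous equation are $r_1^{\,l}$ and $r_1^{-l-1}$. If we demand that $g_l$ remain finite as
$r_1\to 0$ and vanish as $r_1\to\infty$, the technique of Section 16.5 leads to

\[ g_l(r_1,r_2) = \begin{cases} \dfrac{1}{2l+1}\,\dfrac{r_1^{\,l}}{r_2^{\,l+1}}, & r_1<r_2,\\[2ex] \dfrac{1}{2l+1}\,\dfrac{r_2^{\,l}}{r_1^{\,l+1}}, & r_1>r_2, \end{cases} \tag{16.167} \]

or

\[ g_l(r_1,r_2) = \frac{1}{2l+1}\,\frac{r_<^{\,l}}{r_>^{\,l+1}}. \tag{16.168} \]

Hence our Green’s function is

\[ G(\mathbf{r}_1,\mathbf{r}_2) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\frac{1}{2l+1}\,\frac{r_<^{\,l}}{r_>^{\,l+1}}\,Y_l^m(\theta_1,\varphi_1)\,Y_l^{m*}(\theta_2,\varphi_2). \tag{16.169a} \]

Since we already have $G(\mathbf{r}_1,\mathbf{r}_2)$ in closed form, Eq. 16.159, we may write

\[ \frac{1}{4\pi|\mathbf{r}_1-\mathbf{r}_2|} = \sum_{l=0}^{\infty}\sum_{m=-l}^{l}\frac{1}{2l+1}\,\frac{r_<^{\,l}}{r_>^{\,l+1}}\,Y_l^m(\theta_1,\varphi_1)\,Y_l^{m*}(\theta_2,\varphi_2). \tag{16.169b} \]

³ Compare Table 8.1.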
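The radial solution can be spot-checked numerically. Integrating Eq. 16.166 across $r_1=r_2$ gives the condition that $r_2^2$ times the jump in $\partial g_l/\partial r_1$ equals $-1$. The sketch below evaluates that jump for the $g_l$ of Eq. 16.168; the function names and the offset `eps` are illustrative choices, not part of the text.

```python
def g_l(l, r1, r2):
    """Radial Green's function of Eq. 16.168: r_<^l / ((2l+1) r_>^(l+1))."""
    r_less, r_greater = min(r1, r2), max(r1, r2)
    return r_less**l / ((2 * l + 1) * r_greater**(l + 1))


def dg_dr1(l, r1, r2):
    """Analytic derivative of g_l with respect to r1 on either side of r2."""
    if r1 < r2:
        return l * r1**(l - 1) / ((2 * l + 1) * r2**(l + 1))
    return -(l + 1) * r2**l / ((2 * l + 1) * r1**(l + 2))


def jump_times_r2_squared(l, r2, eps=1e-9):
    """r2^2 times the jump in dg_l/dr1 across r1 = r2 (expected: close to -1)."""
    return r2**2 * (dg_dr1(l, r2 + eps, r2) - dg_dr1(l, r2 - eps, r2))


if __name__ == "__main__":
    for l in range(4):
        print(l, jump_times_r2_squared(l, r2=1.7))
```

The jump is independent of $l$ because the factor $2l+1$ from the Wronskian of $r^l$ and $r^{-l-1}$ is cancelled by the normalization in Eq. 16.168.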


One immediate use for this spherical harmonic expansion of the Green’s 
function is in the development of an electrostatic multipole expansion. The 
potential for an arbitrary charge distribution is 


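\[ \psi(\mathbf{r}_1) = \frac{1}{4\pi\varepsilon_0}\int\frac{\rho(\mathbf{r}_2)}{|\mathbf{r}_1-\mathbf{r}_2|}\,d\tau_2 \]

(which is Eq. 8.81). Substituting Eq. 16.169b, we get

\[ \psi(\mathbf{r}_1) = \frac{1}{\varepsilon_0}\sum_{l=0}^{\infty}\sum_{m=-l}^{l}\left\{\frac{1}{2l+1}\,\frac{Y_l^m(\theta_1,\varphi_1)}{r_1^{\,l+1}}\int\rho(\mathbf{r}_2)\,Y_l^{m*}(\theta_2,\varphi_2)\,r_2^{\,l}\,d\varphi_2\,\sin\theta_2\,d\theta_2\,r_2^2\,dr_2\right\}, \]

for $r_1>r_2$.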


This is the multipole expansion. The relative importance of the various terms 
in the double sum depends on the form of the source p(r 2 ). 


FIG. 16.4 (vectors $\mathbf{r}_1$ and $\mathbf{r}_2$ with included angle $\gamma$)

Legendre Polynomial Addition Theorem 

From the generating expression for Legendre polynomials, Eq. 12.4a,

\[ \frac{1}{4\pi|\mathbf{r}_1-\mathbf{r}_2|} = \frac{1}{4\pi}\sum_{l=0}^{\infty}\frac{r_<^{\,l}}{r_>^{\,l+1}}\,P_l(\cos\gamma), \tag{16.170} \]

where $\gamma$ is the angle included between vectors $\mathbf{r}_1$ and $\mathbf{r}_2$, Fig. 16.4. Equating
Eqs. 16.169 and 16.170, we have the Legendre polynomial addition theorem

\[ P_l(\cos\gamma) = \frac{4\pi}{2l+1}\sum_{m=-l}^{l} Y_l^m(\theta_1,\varphi_1)\,Y_l^{m*}(\theta_2,\varphi_2). \tag{16.171} \]

Compare the simplicity (once Green’s functions are understood) of this derivation
with the relatively cumbersome derivation of Section 12.8.
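As a numerical illustration of Eq. 16.170 (with the common factor $1/4\pi$ removed), the sketch below sums the Legendre series and compares it with $1/|\mathbf{r}_1-\mathbf{r}_2|$ computed from the law of cosines. The helper names and the truncation `l_max` are assumptions made for this example.

```python
import math


def legendre(l_max, x):
    """Return [P_0(x), ..., P_lmax(x)] from the three-term recurrence."""
    p = [1.0, x]
    for l in range(1, l_max):
        p.append(((2 * l + 1) * x * p[l] - l * p[l - 1]) / (l + 1))
    return p[: l_max + 1]


def inverse_distance_series(r1, r2, cos_gamma, l_max=60):
    """Truncated sum of r_<^l / r_>^(l+1) * P_l(cos gamma)."""
    r_less, r_greater = min(r1, r2), max(r1, r2)
    p = legendre(l_max, cos_gamma)
    return sum(r_less**l / r_greater**(l + 1) * p[l] for l in range(l_max + 1))


def inverse_distance_exact(r1, r2, cos_gamma):
    """1/|r1 - r2| from the law of cosines."""
    return 1.0 / math.sqrt(r1 * r1 + r2 * r2 - 2.0 * r1 * r2 * cos_gamma)


if __name__ == "__main__":
    print(inverse_distance_series(0.5, 1.0, 0.3))
    print(inverse_distance_exact(0.5, 1.0, 0.3))
```

The series converges geometrically in $r_</r_>$, so many more terms are needed as the two radii approach each other.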

Circular Cylindrical Coordinate Expansion 

In analogy with the preceding spherical polar coordinate expansion, we
write

\[ \delta(\mathbf{r}_1-\mathbf{r}_2) = \frac{1}{\rho_1}\,\delta(\rho_1-\rho_2)\,\delta(\varphi_1-\varphi_2)\,\delta(z_1-z_2)
 = \frac{1}{\rho_1}\,\delta(\rho_1-\rho_2)\,\frac{1}{2\pi}\sum_{m=-\infty}^{\infty} e^{im(\varphi_1-\varphi_2)}\,\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ik(z_1-z_2)}\,dk, \tag{16.172} \]

using Exercise 12.6.5 and Eq. 15.21d. But why this choice? Why a summation
for the $\varphi$-dependence and an integration for the z-dependence? The requirement
that the azimuthal dependence be single-valued quantizes m, hence the summation.
No such restriction is expected on k.

To avoid problems later with negative values of k, we rewrite Eq. 16.172 as

\[ \delta(\mathbf{r}_1-\mathbf{r}_2) = \frac{1}{\rho_1}\,\delta(\rho_1-\rho_2)\,\frac{1}{2\pi}\sum_{m=-\infty}^{\infty} e^{im(\varphi_1-\varphi_2)}\,\frac{1}{\pi}\int_{0}^{\infty}\cos k(z_1-z_2)\,dk, \tag{16.172a} \]

using the Cauchy principal value. We assume a similar expansion of the Green’s
function

\[ G(\mathbf{r}_1,\mathbf{r}_2) = \frac{1}{2\pi^2}\sum_{m=-\infty}^{\infty} g_m(\rho_1,\rho_2)\,e^{im(\varphi_1-\varphi_2)}\int_{0}^{\infty}\cos k(z_1-z_2)\,dk, \tag{16.173} \]

with the $\rho$-dependent coefficients $g_m(\rho_1,\rho_2)$ to be determined. Substituting into
Eq. 16.157, now in circular cylindrical coordinates, we find that if $g_m(\rho_1,\rho_2)$
satisfies

\[ \frac{d}{d\rho_1}\left[\rho_1\frac{dg_m}{d\rho_1}\right] - \left[k^2\rho_1 + \frac{m^2}{\rho_1}\right]g_m = -\delta(\rho_1-\rho_2), \tag{16.174} \]

then Eq. 16.157 is satisfied.

The operator in Eq. 16.174 is identified as the modified Bessel operator (in
self-adjoint form). Hence the solutions of the corresponding homogeneous
equation are $u_1 = I_m(k\rho)$, $u_2 = K_m(k\rho)$. As in the spherical polar coordinate
case, we demand that G be finite at $\rho_1 = 0$ and vanish as $\rho_1\to\infty$. Then the
technique of Section 16.5 yields

\[ g_m(\rho_1,\rho_2) = -\frac{1}{A}\,I_m(k\rho_<)\,K_m(k\rho_>). \tag{16.175} \]

This corresponds to Eq. 16.128. The constant A comes from the Wronskian:

\[ I_m(k\rho)\,K_m'(k\rho) - I_m'(k\rho)\,K_m(k\rho) = \frac{A}{P(k\rho)}. \tag{16.175a} \]

From Exercise 11.5.10, $A = -1$ and

\[ g_m(\rho_1,\rho_2) = I_m(k\rho_<)\,K_m(k\rho_>). \tag{16.176} \]

Therefore our circular cylindrical coordinate Green’s function is

\[ G(\mathbf{r}_1,\mathbf{r}_2) = \frac{1}{4\pi|\mathbf{r}_1-\mathbf{r}_2|}
 = \frac{1}{2\pi^2}\sum_{m=-\infty}^{\infty}\int_{0}^{\infty} I_m(k\rho_<)\,K_m(k\rho_>)\,e^{im(\varphi_1-\varphi_2)}\cos k(z_1-z_2)\,dk. \tag{16.177} \]


Exercise 16.6.14 is a special case of this result. 


EXAMPLE 16.6.1 Quantum Mechanical Scattering — Neumann Series 
Solution 


The quantum theory of scattering provides a nice illustration of integral
equation techniques and an application of a Green’s function. Our physical
picture of scattering is as follows. A beam of particles moves along the negative
z-axis toward the origin. A small fraction of the particles is scattered by the
potential $V(\mathbf{r})$ and goes off as an outgoing spherical wave. Our wave function
$\psi(\mathbf{r})$ must satisfy the time-independent Schrödinger equation

\[ -\frac{\hbar^2}{2m}\nabla^2\psi(\mathbf{r}) + V(\mathbf{r})\,\psi(\mathbf{r}) = E\,\psi(\mathbf{r}) \tag{16.178a} \]

or

\[ \nabla^2\psi(\mathbf{r}) + k^2\psi(\mathbf{r}) = -\left[-\frac{2m}{\hbar^2}\,V(\mathbf{r})\,\psi(\mathbf{r})\right], \tag{16.178b} \]

with

\[ k^2 = 2mE/\hbar^2. \]

From the physical picture just presented we look for a solution having an
asymptotic form

\[ \psi(\mathbf{r}) \sim e^{i\mathbf{k}_0\cdot\mathbf{r}} + f_k(\theta,\varphi)\,\frac{e^{ikr}}{r}. \tag{16.179} \]

Here $e^{i\mathbf{k}_0\cdot\mathbf{r}}$ is the incident plane wave⁴ with $\mathbf{k}_0$ the propagation vector carrying
the subscript 0 to indicate that it is in the $\theta = 0$ (z-axis) direction. The magnitudes
$k_0$ and $k$ are equal. $e^{ikr}/r$ is the outgoing spherical wave with an angular-
(and energy-) dependent amplitude factor $f_k(\theta,\varphi)$.⁵ Vector $\mathbf{k}$ has the direction of
the outgoing scattered wave. In quantum mechanics texts it is shown that the
differential probability of scattering, $d\sigma/d\Omega$, the scattering cross section per
unit solid angle, is given by $|f_k(\theta,\varphi)|^2$.

⁴ For simplicity we assume a continuous incident beam. In a more sophisticated
and more realistic treatment Eq. 16.179 would be one component of
a Fourier wave packet.

⁵ If $V(\mathbf{r})$ represents a central force, $f_k$ will be a function of $\theta$ only, independent
of azimuth.

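Identifying

\[ \left[-\frac{2m}{\hbar^2}\,V(\mathbf{r})\,\psi(\mathbf{r})\right] \]

with $f(\mathbf{r})$ of Eq. 16.156, we have

\[ \psi(\mathbf{r}_1) = -\int\frac{2m}{\hbar^2}\,V(\mathbf{r}_2)\,\psi(\mathbf{r}_2)\,G(\mathbf{r}_1,\mathbf{r}_2)\,d^3r_2 \tag{16.180} \]

by Eq. 16.156f. This does not have the desired asymptotic form Eq. 16.179,
but we may add to Eq. 16.180 $e^{i\mathbf{k}_0\cdot\mathbf{r}_1}$, a solution of the homogeneous equation,
and put $\psi(\mathbf{r})$ into the desired form:

\[ \psi(\mathbf{r}_1) = e^{i\mathbf{k}_0\cdot\mathbf{r}_1} - \int\frac{2m}{\hbar^2}\,V(\mathbf{r}_2)\,\psi(\mathbf{r}_2)\,G(\mathbf{r}_1,\mathbf{r}_2)\,d^3r_2. \tag{16.181} \]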


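Our Green’s function is the Green’s function of the operator $\mathcal{L} = \nabla^2 + k^2$
(Eq. 16.178b) satisfying the boundary condition that it describe an outgoing
wave. Then, from Table 16.1, $G(\mathbf{r}_1,\mathbf{r}_2) = \exp(ik|\mathbf{r}_1-\mathbf{r}_2|)/(4\pi|\mathbf{r}_1-\mathbf{r}_2|)$ and

\[ \psi(\mathbf{r}_1) = e^{i\mathbf{k}_0\cdot\mathbf{r}_1} - \int\frac{2m}{\hbar^2}\,V(\mathbf{r}_2)\,\psi(\mathbf{r}_2)\,\frac{e^{ik|\mathbf{r}_1-\mathbf{r}_2|}}{4\pi|\mathbf{r}_1-\mathbf{r}_2|}\,d^3r_2. \tag{16.182} \]

This integral equation analog of the original Schrödinger wave equation is
exact.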

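Employing the Neumann series technique of Section 16.3 (remember, the
scattering probability is very small), we have

\[ \psi_0(\mathbf{r}_1) = e^{i\mathbf{k}_0\cdot\mathbf{r}_1}, \tag{16.183a} \]

which has the physical interpretation of no scattering.

Substituting $\psi_0(\mathbf{r}_2) = e^{i\mathbf{k}_0\cdot\mathbf{r}_2}$ into the integral, we obtain the first correction
term

\[ \psi_1(\mathbf{r}_1) = e^{i\mathbf{k}_0\cdot\mathbf{r}_1} - \int\frac{2m}{\hbar^2}\,V(\mathbf{r}_2)\,\frac{e^{ik|\mathbf{r}_1-\mathbf{r}_2|}}{4\pi|\mathbf{r}_1-\mathbf{r}_2|}\,e^{i\mathbf{k}_0\cdot\mathbf{r}_2}\,d^3r_2. \tag{16.183b} \]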


This is the famous Born approximation. It is expected to be most accurate for 
weak potentials and high incident energy. If a more accurate approximation 
is desired the Neumann series may be continued. 6 


EXAMPLE 16.6.2 Quantum Mechanical Scattering — Green’s Function 

Again, we consider the Schrödinger wave equation (Eq. 16.178b) for the
scattering problem. This time we use Fourier transform techniques and derive
the desired form of the Green’s function by contour integration. Substituting
the desired asymptotic form of the solution (with k replaced by $k_0$)

\[ \psi(\mathbf{r}) \sim e^{ik_0z} + f_k(\theta,\varphi)\,\frac{e^{ik_0r}}{r} = e^{ik_0z} + \Phi(\mathbf{r}) \tag{16.179a} \]

into the Schrödinger wave equation, Eq. 16.178b, yields

\[ (\nabla^2 + k_0^2)\,\Phi(\mathbf{r}) = U(\mathbf{r})\,e^{ik_0z} + U(\mathbf{r})\,\Phi(\mathbf{r}). \tag{16.184a} \]

Here

\[ U(\mathbf{r}) = \frac{2m}{\hbar^2}\,V(\mathbf{r}), \]

the scattering (perturbing) potential. Since the probability of scattering is
much less than one, the second term on the right-hand side of Eq. 16.184a is
expected to be negligible (relative to the first term on the right-hand side) and
thus we drop it. Note that we are approximating our differential equation with

\[ (\nabla^2 + k_0^2)\,\Phi(\mathbf{r}) = U(\mathbf{r})\,e^{ik_0z}. \tag{16.184b} \]

⁶ This assumes the Neumann series is convergent. In some physical situations
it is not convergent and then other techniques are needed.

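We now proceed to solve Eq. 16.184b, a nonhomogeneous differential equation.
The differential operator $\nabla^2$ generates a continuous set of eigenfunctions

\[ \nabla^2\psi_{\mathbf{k}}(\mathbf{r}) = -k^2\,\psi_{\mathbf{k}}(\mathbf{r}), \tag{16.185} \]

where

\[ \psi_{\mathbf{k}}(\mathbf{r}) = (2\pi)^{-3/2}\,e^{i\mathbf{k}\cdot\mathbf{r}}. \]

These eigenfunctions form a continuous but orthonormal set in the sense that

\[ \int\psi_{\mathbf{k}_1}^*(\mathbf{r})\,\psi_{\mathbf{k}_2}(\mathbf{r})\,d^3r = \delta(\mathbf{k}_1-\mathbf{k}_2) \]

(compare Eq. 15.21d).⁷ We use these eigenfunctions to derive a Green’s function.
We expand the unknown function $\Phi(\mathbf{r}_1)$ in these eigenfunctions,

\[ \Phi(\mathbf{r}_1) = \int A_{\mathbf{k}_1}\,\psi_{\mathbf{k}_1}(\mathbf{r}_1)\,d^3k_1, \tag{16.186} \]

a Fourier integral with $A_{\mathbf{k}_1}$ the unknown coefficients. Substituting Eq. 16.186
into Eq. 16.184b and using Eq. 16.185, we obtain

\[ \int A_{\mathbf{k}}\,(k_0^2-k^2)\,\psi_{\mathbf{k}}(\mathbf{r})\,d^3k = U(\mathbf{r})\,e^{ik_0z}. \tag{16.187} \]

Using the now familiar technique of multiplying by $\psi_{\mathbf{k}_2}^*(\mathbf{r})$ and integrating over
the space coordinates, we have

\[ \int A_{\mathbf{k}_1}(k_0^2-k_1^2)\,d^3k_1\int\psi_{\mathbf{k}_2}^*(\mathbf{r})\,\psi_{\mathbf{k}_1}(\mathbf{r})\,d^3r = A_{\mathbf{k}_2}(k_0^2-k_2^2)
 = \int\psi_{\mathbf{k}_2}^*(\mathbf{r})\,U(\mathbf{r})\,e^{ik_0z}\,d^3r. \tag{16.188} \]

Solving for $A_{\mathbf{k}_2}$ and substituting into Eq. 16.186, we have

\[ \Phi(\mathbf{r}_2) = \int\left[(k_0^2-k_2^2)^{-1}\int\psi_{\mathbf{k}_2}^*(\mathbf{r}_1)\,U(\mathbf{r}_1)\,e^{ik_0z_1}\,d^3r_1\right]\psi_{\mathbf{k}_2}(\mathbf{r}_2)\,d^3k_2. \tag{16.189} \]

Hence

\[ \Phi(\mathbf{r}_1) = \int\psi_{\mathbf{k}_1}(\mathbf{r}_1)\,(k_0^2-k_1^2)^{-1}\,d^3k_1\int\psi_{\mathbf{k}_1}^*(\mathbf{r}_2)\,U(\mathbf{r}_2)\,e^{ik_0z_2}\,d^3r_2, \tag{16.190} \]

replacing $\mathbf{k}_2$ by $\mathbf{k}_1$ and $\mathbf{r}_1$ by $\mathbf{r}_2$ to agree with Eq. 16.186. Reversing the order
of integration, we have

\[ \Phi(\mathbf{r}_1) = -\int G_{k_0}(\mathbf{r}_1,\mathbf{r}_2)\,U(\mathbf{r}_2)\,e^{ik_0z_2}\,d^3r_2, \tag{16.191} \]

where $G_{k_0}(\mathbf{r}_1,\mathbf{r}_2)$, our Green’s function, is given by

\[ G_{k_0}(\mathbf{r}_1,\mathbf{r}_2) = \int\frac{\psi_{\mathbf{k}_1}^*(\mathbf{r}_2)\,\psi_{\mathbf{k}_1}(\mathbf{r}_1)}{k_1^2-k_0^2}\,d^3k_1, \tag{16.192} \]

analogous to Eq. 9.91 of Section 9.4 for discrete eigenfunctions. Equation 16.191
should be compared with the Green’s function solution of Poisson’s equation
(16.116).

⁷ $d^3r = dx\,dy\,dz$, a (three-dimensional) volume element in r-space.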

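It is perhaps worth evaluating this integral to emphasize once more the vital
role played by the boundary conditions. Using the eigenfunctions from Eq.
16.185 and

\[ d^3k = k^2\,dk\,\sin\theta\,d\theta\,d\varphi, \tag{16.193} \]

we obtain

\[ G_{k_0}(\mathbf{r}_1,\mathbf{r}_2) = \frac{1}{(2\pi)^3}\int_0^{\infty}\int_0^{\pi}\int_0^{2\pi}\frac{e^{ik\rho\cos\theta}}{k^2-k_0^2}\,d\varphi\,\sin\theta\,d\theta\,k^2\,dk. \tag{16.194} \]

Here $k\rho\cos\theta$ has replaced $\mathbf{k}\cdot(\mathbf{r}_1-\mathbf{r}_2)$, with $\boldsymbol\rho = \mathbf{r}_1-\mathbf{r}_2$ indicating the polar
axis in k-space. Integrating over $\varphi$ by inspection, we pick up a $2\pi$. The $\theta$-integration
then leads to

\[ G_{k_0}(\mathbf{r}_1,\mathbf{r}_2) = \frac{1}{4\pi^2\rho i}\int_0^{\infty}\frac{e^{ik\rho}-e^{-ik\rho}}{k^2-k_0^2}\,k\,dk, \tag{16.195} \]

and since the integrand is an even function of k, we may set

\[ G_{k_0}(\mathbf{r}_1,\mathbf{r}_2) = \frac{1}{8\pi^2\rho i}\int_{-\infty}^{\infty}\frac{(e^{i\kappa}-e^{-i\kappa})}{\kappa^2-\sigma^2}\,\kappa\,d\kappa. \tag{16.196} \]

The latter step is taken in anticipation of the evaluation of $G_k(\mathbf{r}_1,\mathbf{r}_2)$ as a contour
integral. The symbols $\kappa$ and $\sigma$ ($\sigma > 0$) represent $k\rho$ and $k_0\rho$, respectively.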

If the integral in Eq. 16.196 is interpreted as a Riemann integral, the integral
does not exist. This implies that $\mathcal{L}^{-1}$ does not exist, and in a literal sense it
does not. $\mathcal{L} = \nabla^2 + k^2$ is singular since there exist nontrivial solutions $\psi$ for
which the homogeneous equation $\mathcal{L}\psi = 0$ holds (compare Exercise 4.6.6). We avoid
this problem by introducing a parameter $\gamma$, defining a different operator $\mathcal{L}_\gamma^{-1}$,
and taking the limit as $\gamma\to 0$.

Splitting the integral into two parts so each part may be written as a suitable
contour integral gives us

\[ G(\mathbf{r}_1,\mathbf{r}_2) = \frac{1}{8\pi^2\rho i}\oint_{C_1}\frac{\kappa\,e^{i\kappa}\,d\kappa}{\kappa^2-\sigma^2} + \frac{1}{8\pi^2\rho i}\oint_{C_2}\frac{\kappa\,e^{-i\kappa}\,d\kappa}{\kappa^2-\sigma^2}. \tag{16.197} \]

Contour $C_1$ is closed by a semicircle in the upper half-plane, $C_2$ by a semicircle
in the lower half-plane. (Both contours are traversed counterclockwise, so $C_2$
runs along the real axis from $+\infty$ to $-\infty$; this reversal absorbs the minus sign
of the second term in Eq. 16.196.) These integrals were evaluated in Chapter 7 by using
appropriately chosen infinitesimal semicircles to go around the singular points
$\kappa = \pm\sigma$. As an alternative procedure, let us first displace the singular points
from the real axis by replacing $\sigma$ by $\sigma + i\gamma$ and then, after evaluation, taking the
limit as $\gamma\to 0$ (Fig. 16.5).

FIG. 16.5 Possible Green’s function contours of integration

For $\gamma$ positive, contour $C_1$ encloses the singular point $\kappa = \sigma + i\gamma$ and the
first integral contributes

\[ 2\pi i\cdot\tfrac{1}{2}\,e^{i(\sigma+i\gamma)}. \]

From the second integral we also obtain

\[ 2\pi i\cdot\tfrac{1}{2}\,e^{i(\sigma+i\gamma)}, \]

the enclosed singularity being $\kappa = -(\sigma + i\gamma)$. Returning to Eq. 16.197 and
letting $\gamma\to 0$, we have

\[ G(\mathbf{r}_1,\mathbf{r}_2) = \frac{1}{4\pi\rho}\,e^{i\sigma} = \frac{e^{ik_0|\mathbf{r}_1-\mathbf{r}_2|}}{4\pi|\mathbf{r}_1-\mathbf{r}_2|}, \tag{16.198} \]

in full agreement with Exercise 8.7.16. This result depends on starting with $\gamma$
positive. Had we chosen $\gamma$ negative, our Green’s function would have included
$e^{-i\sigma}$, which corresponds to an incoming wave. The choice of positive $\gamma$ is dictated
by the boundary conditions we wish to satisfy.

Equations 16.191 and 16.198 reproduce the scattered wave in Eq. 16.183b
and constitute an exact solution of the approximate Eq. 16.184b. Exercises
16.6.18 to 16.6.20 extend these results.
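As a rough numerical companion to those exercises, the sketch below evaluates the radial Born integral of Exercise 16.6.18, $\int_0^\infty r\,V(r)\,\sin(qr)/q\;dr$ with $q = |\mathbf{k}_0-\mathbf{k}|$, for the meson potential $V(r) = V_0e^{-\alpha r}/(\alpha r)$ of Exercise 16.6.19, and compares it with the closed form $V_0/[\alpha(\alpha^2+q^2)]$. The function names, the Simpson rule, the cutoff `r_max`, and the parameter `two_m_over_hbar2` (standing for $2m/\hbar^2$ in whatever units are chosen) are assumptions of this example.

```python
import math


def born_integral_numeric(v0, alpha, q, r_max=None, n=20000):
    """Composite Simpson estimate of int_0^inf r V(r) sin(q r)/q dr
    for V(r) = v0 * exp(-alpha r) / (alpha r). n must be even."""
    if r_max is None:
        r_max = 50.0 / alpha  # exp(-50) makes the neglected tail negligible
    h = r_max / n

    def integrand(r):
        # r * V(r) * sin(q r) / q; the factor r cancels the 1/r in V(r)
        return v0 * math.exp(-alpha * r) * math.sin(q * r) / (alpha * q)

    total = integrand(0.0) + integrand(r_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def born_integral_closed(v0, alpha, q):
    """Closed form of the same integral (a Laplace transform of sin)."""
    return v0 / (alpha * (alpha**2 + q**2))


def scattering_amplitude(v0, alpha, k, theta, two_m_over_hbar2=1.0):
    """Born amplitude f_k(theta) with q = 2 k sin(theta/2) for |k0| = |k|."""
    q = 2.0 * k * math.sin(theta / 2.0)
    return -two_m_over_hbar2 * born_integral_closed(v0, alpha, q)


if __name__ == "__main__":
    print(born_integral_numeric(1.0, 1.0, 2.0), born_integral_closed(1.0, 1.0, 2.0))
```

The relation $q = 2k\sin(\theta/2)$ used in `scattering_amplitude` follows from $|\mathbf{k}_0| = |\mathbf{k}|$ and is what produces the $\sin^4(\theta/2)$ dependence in Exercise 16.6.20.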


EXERCISES 


16 . 6.1 Verify Eq. 16.1564, 

16.6.2 Show that the terms $+k^2$ in the Helmholtz operator and $-k^2$ in the modified
Helmholtz operator do not affect the behavior of $G(\mathbf{r}_1,\mathbf{r}_2)$ in the immediate
vicinity of the singular point $\mathbf{r}_1 = \mathbf{r}_2$. Specifically, show that

\[ \lim_{|\mathbf{r}_2-\mathbf{r}_1|\to 0}\int k^2\,G(\mathbf{r}_1,\mathbf{r}_2)\,d\tau_2 = 0. \]

16.6.3 Show that

\[ \frac{\exp(ik|\mathbf{r}_1-\mathbf{r}_2|)}{4\pi|\mathbf{r}_1-\mathbf{r}_2|} \]

satisfies the two appropriate criteria and therefore is a Green’s function for the
Helmholtz equation.

16.6.4 (a) Find the Green’s function for the three-dimensional Helmholtz equation,
Exercise 8.7.16, when the wave is a standing wave.

(b) How is this Green’s function related to the spherical Bessel functions?


16.6.5 The homogeneous Helmholtz equation

\[ \nabla^2\varphi + \lambda^2\varphi = 0 \]

has eigenvalues $\lambda_i^2$ and eigenfunctions $\varphi_i$. Show that the corresponding Green’s
function that satisfies

\[ \nabla^2G(\mathbf{r}_1,\mathbf{r}_2) + k^2G(\mathbf{r}_1,\mathbf{r}_2) = -\delta(\mathbf{r}_1-\mathbf{r}_2) \]

may be written as

\[ G(\mathbf{r}_1,\mathbf{r}_2) = \sum_{i=1}^{\infty}\frac{\varphi_i(\mathbf{r}_1)\,\varphi_i(\mathbf{r}_2)}{\lambda_i^2-k^2}. \]

An expansion of this form is called a bilinear expansion. If the Green’s function
is available in closed form, this provides a means of generating functions.

16.6.6 An electrostatic potential (mks units) is

\[ \varphi(r) = \frac{Z}{4\pi\varepsilon_0}\,\frac{e^{-ar}}{r}. \]

Reconstruct the electrical charge distribution that will produce this potential.
Note that $\varphi(r)$ vanishes exponentially for large r, showing that the net charge is
zero.

ANS. $\rho(r) = Z\,\delta(\mathbf{r}) - \dfrac{Za^2}{4\pi}\,\dfrac{e^{-ar}}{r}$.


16.6.7 Transform the differential equation

\[ \frac{d^2y(r)}{dr^2} - k^2y(r) + V_0\,\frac{e^{-r}}{r}\,y(r) = 0 \]

and the boundary conditions $y(0) = y(\infty) = 0$ into a Fredholm integral equation
of the form

\[ y(r) = \lambda\int_0^{\infty}G(r,t)\,\frac{e^{-t}}{t}\,y(t)\,dt. \]

The quantities $V_0 = \lambda$ and $k^2$ are constants. The differential equation is derived from
the Schrödinger wave equation with a meson potential.

\[ G(r,t) = \begin{cases}\dfrac{1}{k}\,e^{-kt}\sinh kr, & 0\le r<t,\\[2ex] \dfrac{1}{k}\,e^{-kr}\sinh kt, & t<r<\infty.\end{cases} \]

16.6.8 A charged conducting ring of radius a (Example 12.3.3) may be described by

\[ \rho(\mathbf{r}) = \frac{q}{2\pi a^2}\,\delta(r-a)\,\delta(\cos\theta). \]

Using the known Green’s function for this system, find the electrostatic potential.

Hint. Exercise 12.6.3 will be helpful.


16.6.9 Changing a separation constant from $k^2$ to $-k^2$ and putting the discontinuity
of the first derivative into the z-dependence, show that

\[ \frac{1}{4\pi|\mathbf{r}_1-\mathbf{r}_2|} = \frac{1}{4\pi}\sum_{m=-\infty}^{\infty}\int_0^{\infty}e^{im(\varphi_1-\varphi_2)}\,J_m(k\rho_1)\,J_m(k\rho_2)\,e^{-k|z_1-z_2|}\,dk. \]

Hint. The required $\delta(\rho_1-\rho_2)$ may be obtained from Exercise 15.1.2.

16.6.10 Derive the expansion

\[ \frac{\exp[ik|\mathbf{r}_1-\mathbf{r}_2|]}{4\pi|\mathbf{r}_1-\mathbf{r}_2|} = ik\sum_{l=0}^{\infty}\left\{\begin{matrix} j_l(kr_1)\,h_l^{(1)}(kr_2), & r_1<r_2\\[1ex] j_l(kr_2)\,h_l^{(1)}(kr_1), & r_1>r_2\end{matrix}\right\}\sum_{m=-l}^{l}Y_l^m(\theta_1,\varphi_1)\,Y_l^{m*}(\theta_2,\varphi_2). \]

Hint. The left side is a known Green’s function. Assume a spherical harmonic
expansion and work on the remaining radial dependence. The spherical harmonic
closure relation, Exercise 12.6.6, covers the angular dependence.

16.6.11 Show that the modified Helmholtz operator Green’s function, $\exp(-k|\mathbf{r}_1-\mathbf{r}_2|)/(4\pi|\mathbf{r}_1-\mathbf{r}_2|)$,
has the spherical polar coordinate expansion

\[ \frac{\exp(-k|\mathbf{r}_1-\mathbf{r}_2|)}{4\pi|\mathbf{r}_1-\mathbf{r}_2|} = k\sum_{l=0}^{\infty}i_l(kr_<)\,k_l(kr_>)\sum_{m=-l}^{l}Y_l^m(\theta_1,\varphi_1)\,Y_l^{m*}(\theta_2,\varphi_2). \]

Note. The modified spherical Bessel functions $i_l(kr)$ and $k_l(kr)$ are defined in
Exercise 11.7.15.

16.6.12 From the spherical Green’s function of Exercise 16.6.10, derive the plane wave
expansion

\[ e^{i\mathbf{k}\cdot\mathbf{r}} = \sum_{l=0}^{\infty}i^l(2l+1)\,j_l(kr)\,P_l(\cos\gamma), \]

where $\gamma$ is the angle included between $\mathbf{k}$ and $\mathbf{r}$. This is the Rayleigh equation of
Exercise 12.4.7.

Hint. Take $r_2\gg r_1$ so that

\[ |\mathbf{r}_1-\mathbf{r}_2| \to r_2 - \mathbf{r}_{20}\cdot\mathbf{r}_1 = r_2 - \frac{\mathbf{k}\cdot\mathbf{r}_1}{k}. \]

Let $r_2\to\infty$ and cancel a factor of $e^{ikr_2}/r_2$.

16.6.13 From the results of Exercises 16.6.10 and 16.6.12, show that

\[ e^{ix} = \sum_{l=0}^{\infty}i^l(2l+1)\,j_l(x). \]

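16.6.14 (a) From the circular cylindrical coordinate expansion of the Laplace Green’s
function (Eq. 16.177), show that

\[ \frac{1}{(\rho^2+z^2)^{1/2}} = \frac{2}{\pi}\int_0^{\infty}K_0(k\rho)\cos kz\,dk. \]

This same result is obtained directly in Exercise 15.3.11.

(b) As a special case of part (a) show that

\[ \int_0^{\infty}K_0(k)\,dk = \frac{\pi}{2}. \]

16.6.15 Noting that

\[ \psi_{\mathbf{k}}(\mathbf{r}) = \frac{1}{(2\pi)^{3/2}}\,e^{i\mathbf{k}\cdot\mathbf{r}} \]

is an eigenfunction of

\[ (\nabla^2+k^2)\,\psi_{\mathbf{k}}(\mathbf{r}) = 0 \]

(Eqs. 16.183 and 16.184), show that the infinite Green’s function of $\mathcal{L} = \nabla^2$
may be expanded as

\[ \frac{1}{4\pi|\mathbf{r}_1-\mathbf{r}_2|} = \frac{1}{(2\pi)^3}\int e^{i\mathbf{k}\cdot(\mathbf{r}_1-\mathbf{r}_2)}\,\frac{d^3k}{k^2}. \]

16.6.16 Using Fourier transforms, show that the Green’s function satisfying the nonhomogeneous
Helmholtz equation

\[ (\nabla^2+k_0^2)\,G(\mathbf{r}_1,\mathbf{r}_2) = -\delta(\mathbf{r}_1-\mathbf{r}_2) \]

is

\[ G(\mathbf{r}_1,\mathbf{r}_2) = \frac{1}{(2\pi)^3}\int\frac{e^{i\mathbf{k}\cdot(\mathbf{r}_1-\mathbf{r}_2)}}{k^2-k_0^2}\,d^3k, \]

in agreement with Eq. 16.192.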

16.6.17 The basic equation of the scalar Kirchhoff diffraction theory is

\[ \psi(\mathbf{r}_1) = \frac{1}{4\pi}\int_{S_2}\left[\frac{e^{ikr}}{r}\,\nabla\psi(\mathbf{r}_2) - \psi(\mathbf{r}_2)\,\nabla\!\left(\frac{e^{ikr}}{r}\right)\right]\cdot d\boldsymbol\sigma_2, \]

where $\psi$ satisfies the homogeneous Helmholtz equation and $r = |\mathbf{r}_1-\mathbf{r}_2|$.
Derive this equation. Assume that $\mathbf{r}_1$ is interior to the closed surface $S_2$.

Hint. Use Green’s theorem.

16.6.18 The Born approximation for the scattered wave is given by Eq. 16.183b (and
Eq. 16.191). From the asymptotic form, Eq. 16.179,

\[ f_k(\theta,\varphi)\,\frac{e^{ikr}}{r} = -\frac{2m}{\hbar^2}\int V(\mathbf{r}_2)\,\frac{e^{ik|\mathbf{r}_1-\mathbf{r}_2|}}{4\pi|\mathbf{r}_1-\mathbf{r}_2|}\,e^{i\mathbf{k}_0\cdot\mathbf{r}_2}\,d^3r_2. \]

For the scattering potential $V(r_2)$ independent of angles and for $r_1\gg r_2$, show
that

\[ f_k(\theta,\varphi) = -\frac{2m}{\hbar^2}\int_0^{\infty}r_2\,V(r_2)\,\frac{\sin(|\mathbf{k}_0-\mathbf{k}|\,r_2)}{|\mathbf{k}_0-\mathbf{k}|}\,dr_2. \]

Here $\mathbf{k}_0$ is in the $\theta = 0$ (original z-axis) direction, whereas $\mathbf{k}$ is in the $(\theta,\varphi)$
direction. The magnitudes are equal: $|\mathbf{k}_0| = |\mathbf{k}|$. m is the reduced mass.

Hint. You have Exercise 16.6.12 to simplify the exponential and Exercise 15.3.20
to transform the three-dimensional Fourier exponential transform into a one-dimensional
Fourier sine transform.


16.6.19 Calculate the scattering amplitude $f_k(\theta,\varphi)$ for a meson potential
$V(r) = V_0\,\dfrac{e^{-\alpha r}}{\alpha r}$.

Hint. This particular potential permits the Born integral, Exercise 16.6.18, to
be evaluated as a Laplace transform.

ANS. $f_k(\theta,\varphi) = -\dfrac{2mV_0}{\hbar^2\alpha}\,\dfrac{1}{\alpha^2+(\mathbf{k}_0-\mathbf{k})^2}$.

16.6.20 The meson potential $V(r) = V_0(e^{-\alpha r}/\alpha r)$ may be used to describe the Coulomb
scattering of two charges $q_1$ and $q_2$. We let $\alpha\to 0$ and $V_0\to 0$ but take the ratio
$V_0/\alpha$ to be $q_1q_2/4\pi\varepsilon_0$. (For Gaussian units omit the $4\pi\varepsilon_0$.) Show that the differential
scattering cross section $d\sigma/d\Omega = |f_k(\theta,\varphi)|^2$ is given by

\[ \frac{d\sigma}{d\Omega} = \left(\frac{q_1q_2}{4\pi\varepsilon_0}\right)^2\frac{1}{16E^2\sin^4(\theta/2)},\qquad E = \frac{p^2}{2m} = \frac{\hbar^2k^2}{2m}. \]

It happens (coincidentally) that this Born approximation is in exact agreement
with both the exact quantum mechanical calculations and the classical Rutherford
calculation.

17 CALCULUS OF
VARIATIONS


Uses of the Calculus of Variations 

Before plunging into this new and rather different branch of mathematical 
physics, let us summarize some of its uses in both physics and mathematics. 

1. Existing physical theories : 

a. Unification of diverse areas of physics — using 
energy as a key concept. 

b. Convenience in analysis — Lagrange equations, 

Section 17.3. 

c. Convenient introduction of constraints, Section 
17.7. 

2. Starting point for new, complex areas of physics and 
engineering. In general relativity the geodesic is taken 
as the minimum path of a light pulse in curved 
Riemannian space. Variational principles appear in 
modern quantum field theory. Variational principles 
have been applied extensively in modern control 
theory. 

3. Mathematical unification. Variational analysis pro- 
vides a proof of the completeness of the Sturm- 
Liouville eigenfunctions, Chapter 9, and establishes 
a lower bound for the eigenvalues. Similar results 
follow for the eigenvalues and eigenfunctions of the 
Hilbert-Schmidt integral equation, Section 16.4. 

4. Calculation techniques, Section 17.8. Calculation of 
the eigenfunctions and eigenvalues of the Sturm- 
Liouville equation. Integral equation eigenfunctions 
and eigenvalues may be calculated using numerical 
quadrature and matrix techniques, Section 16.3. 


17.1 ONE-DEPENDENT AND ONE-INDEPENDENT
VARIABLE

Concept of Variation

The calculus of variations involves problems in which the quantity to be
minimized (or maximized) appears as an integral. As the simplest case, let

\[ J = \int_{x_1}^{x_2}f(y,y_x,x)\,dx. \tag{17.1} \]

Here J is the quantity that takes on an extreme value. Under the integral sign,
f is a known function of the indicated variables $y(x)$, $y_x(x) = dy(x)/dx$, and x,
but the dependence of y on x is not fixed; that is, $y(x)$ is unknown. This means
that although the integral is from $x_1$ to $x_2$, the exact path of integration is not
known (Fig. 17.1).

FIG. 17.1 A varied path

We are to choose the path of integration through points $(x_1,y_1)$ and $(x_2,y_2)$
to minimize J. Strictly speaking, we determine stationary values of J: minima,
maxima, or saddle points. In most cases of physical interest the stationary value
will be a minimum.

This problem is considerably more difficult than the corresponding problem
in differential calculus. Indeed, there may be no solution. In differential calculus
the minimum is determined by comparing $y(x_0)$ with $y(x)$, where x ranges over
neighboring points. Here we assume the existence of an optimum path, that is,
an acceptable path for which J is stationary, and then compare J for our
(unknown) optimum path with that obtained from neighboring paths. In Fig.
17.1 two possible paths are shown. (There are an infinite number of possibilities,
of course.) The difference between these two for a given x is called the variation
of y, $\delta y$, and is conveniently described by introducing a new function $\eta(x)$ to
define the arbitrary deformation of the path and a scale factor $\alpha$ to give the
magnitude of the variation. The function $\eta(x)$ is arbitrary except for two restrictions.
First,

\[ \eta(x_1) = \eta(x_2) = 0, \tag{17.2} \]

which means that all varied paths must pass through the fixed end points.
Second, as will be seen shortly, $\eta(x)$ must be differentiable; that is, we may not
use

\[ \eta(x) = \begin{cases}1, & x = x_0,\\ 0, & x\ne x_0,\end{cases} \tag{17.3} \]

but we can choose $\eta(x)$ to have a form similar to the functions used to represent
the Dirac delta function (Chapters 8 and 16) so that $\eta(x)$ differs from zero only
over an infinitesimal region.¹ Then, with the path described with $\alpha$ and $\eta(x)$,

\[ y(x,\alpha) = y(x,0) + \alpha\,\eta(x), \tag{17.4} \]

and

\[ \delta y = y(x,\alpha) - y(x,0) = \alpha\,\eta(x). \tag{17.5} \]


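Let us choose $y(x,\alpha = 0)$ as the unknown path that will minimize J. Then
$y(x,\alpha)$ describes a neighboring path. In Eq. 17.1 J is now a function² of our new
parameter $\alpha$:

\[ J(\alpha) = \int_{x_1}^{x_2}f[y(x,\alpha),y_x(x,\alpha),x]\,dx, \tag{17.6} \]

and our condition for an extreme value is that

\[ \left[\frac{\partial J(\alpha)}{\partial\alpha}\right]_{\alpha=0} = 0, \tag{17.7} \]

analogous to the vanishing of the derivative $dy/dx$ in differential calculus.

Now the $\alpha$-dependence of the integral is contained in $y(x,\alpha)$ and $y_x(x,\alpha) =
(\partial/\partial x)\,y(x,\alpha)$. Therefore³

\[ \frac{\partial J(\alpha)}{\partial\alpha} = \int_{x_1}^{x_2}\left[\frac{\partial f}{\partial y}\frac{\partial y}{\partial\alpha} + \frac{\partial f}{\partial y_x}\frac{\partial y_x}{\partial\alpha}\right]dx. \tag{17.8} \]

From Eq. 17.4

\[ \frac{\partial y(x,\alpha)}{\partial\alpha} = \eta(x), \tag{17.9} \]

\[ \frac{\partial y_x(x,\alpha)}{\partial\alpha} = \frac{d\eta(x)}{dx}. \tag{17.10} \]

Equation 17.8 becomes

\[ \frac{\partial J(\alpha)}{\partial\alpha} = \int_{x_1}^{x_2}\left[\frac{\partial f}{\partial y}\,\eta(x) + \frac{\partial f}{\partial y_x}\frac{d\eta(x)}{dx}\right]dx. \tag{17.11} \]

Integrating the second term by parts, we obtain

\[ \int_{x_1}^{x_2}\frac{d\eta(x)}{dx}\frac{\partial f}{\partial y_x}\,dx = \eta(x)\frac{\partial f}{\partial y_x}\bigg|_{x_1}^{x_2} - \int_{x_1}^{x_2}\eta(x)\,\frac{d}{dx}\frac{\partial f}{\partial y_x}\,dx. \tag{17.12} \]

The integrated part vanishes by Eq. 17.2 and Eq. 17.11 becomes

\[ \int_{x_1}^{x_2}\left[\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y_x}\right]\eta(x)\,dx = 0. \tag{17.13} \]

¹ Compare H. Jeffreys and B. S. Jeffreys, Methods of Mathematical Physics,
3rd ed. Cambridge: Cambridge University Press (1966), Chapter 10, for a
more complete discussion of this point.

² Technically, J is a functional, depending on the functions $y(x,\alpha)$ and $y_x(x,\alpha)$:
$J[y(x,\alpha),y_x(x,\alpha)]$.

³ Note that y and $y_x$ are being treated as independent variables.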


In this form $\alpha$ has been set equal to zero and, in effect, is no longer part of the
problem.

Occasionally we will see Eq. 17.13 multiplied by $\alpha$, which gives

\[ \int_{x_1}^{x_2}\left[\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y_x}\right]\delta y\,dx = \alpha\left[\frac{\partial J}{\partial\alpha}\right]_{\alpha=0} = \delta J = 0. \tag{17.14} \]

Since $\eta(x)$ is arbitrary (as already discussed), we may choose it to have the same
sign as the bracketed expression whenever the latter differs from zero. Hence
the integrand is always nonnegative. Equation 17.13, our condition for the
existence of a stationary value, can then be satisfied only if the bracketed term
itself is identically zero. The condition for our stationary value is thus a partial
differential equation,⁴

\[ \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y_x} = 0, \tag{17.15} \]

known as the Euler equation, which can be expressed in various other forms.


Alternate Forms of Euler Equations 

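One other form (Exercise 17.1.1), which is often useful, is

\[ \frac{\partial f}{\partial x} - \frac{d}{dx}\left(f - y_x\frac{\partial f}{\partial y_x}\right) = 0. \tag{17.16} \]

In problems in which $f = f(y,y_x)$ and x does not appear explicitly, Eq. 17.16
reduces to

\[ \frac{d}{dx}\left(f - y_x\frac{\partial f}{\partial y_x}\right) = 0 \tag{17.17} \]

or

\[ f - y_x\frac{\partial f}{\partial y_x} = \text{constant}. \tag{17.18} \]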


It is clear that Eq. 17.15 or 17.16 must be satisfied for J to take on a stationary
value, that is, for Eq. 17.14 to be satisfied. Equation 17.15 is necessary, but it
is by no means sufficient.⁵ Courant and Robbins illustrate this very nicely by
considering the distance over a sphere between points on the sphere, A and B,
Fig. 17.2. Path (1), a great circle route, is found from Eq. 17.15. But path (2),
the remainder of the great circle through points A and B, also satisfies the Euler
equation. Path (2) is a maximum but only if we demand that it be a great circle
and then only if we make less than one circuit; that is, path (2) + n complete
revolutions is also a solution. If the path is not required to be a great circle,
any deviation from (2) will increase the length. This is hardly the property of
a local maximum, and that is why it is important to check the properties of
solutions of Eq. 17.15 to see if they satisfy the physical conditions of the given
problem.

FIG. 17.2 Stationary paths over a sphere

⁴ It is important to watch the meaning of $\partial/\partial x$ and $d/dx$ closely. For example,
if $f = f[y(x),x]$,

\[ \frac{df}{dx} = \frac{\partial f}{\partial x} + \frac{\partial f}{\partial y}\frac{dy}{dx}. \]

The first term on the right gives the explicit x-dependence. The second term
gives the implicit x-dependence.

⁵ For a discussion of sufficiency conditions and the development of the calculus
of variations as a part of modern mathematics see G. M. Ewing, Calculus of
Variations with Applications, Norton, New York (1969). Sufficiency conditions
are also covered by Sagan (reference listed at the end of this chapter).


EXERCISES 


17.1.1 Show the equivalence of the two forms of Euler’s equation:

\[ \frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y_x} = 0 \]

and

\[ \frac{\partial f}{\partial x} - \frac{d}{dx}\left(f - y_x\frac{\partial f}{\partial y_x}\right) = 0. \]

17.1.2 Derive Euler’s equation by expanding the integrand of

\[ J(\alpha) = \int_{x_1}^{x_2}f[y(x,\alpha),y_x(x,\alpha),x]\,dx \]

in powers of $\alpha$, using a Taylor (Maclaurin) expansion with y and $y_x$ as the two
variables (Section 5.6).

Note. The stationary condition is $\partial J(\alpha)/\partial\alpha = 0$, evaluated at $\alpha = 0$. The terms
quadratic in $\alpha$ may be useful in establishing the nature of the stationary solution
(maximum, minimum, or saddle point).

17.1.3 Find the Euler equation corresponding to Eq. 17.15 if $f = f(y_{xx},y_x,y,x)$.

ANS.

\[ \frac{d^2}{dx^2}\left(\frac{\partial f}{\partial y_{xx}}\right) - \frac{d}{dx}\left(\frac{\partial f}{\partial y_x}\right) + \frac{\partial f}{\partial y} = 0, \]

with

\[ \eta(x_1) = \eta(x_2) = 0,\qquad \eta_x(x_1) = \eta_x(x_2) = 0. \]

17.1.4 The integrand $f(y,y_x,x)$ of Eq. 17.1 has the form

\[ f(y,y_x,x) = f_1(x,y) + f_2(x,y)\,y_x. \]

(a) Show that the Euler equation leads to

\[ \frac{\partial f_1}{\partial y} - \frac{\partial f_2}{\partial x} = 0. \]

(b) What does this imply for the dependence of the integral J upon the choice
of path?

17.1.5 Show that the condition that

\[ J = \int f(x,y)\,dx \]

have a stationary value

(a) leads to $f(x,y)$ independent of y and

(b) yields no information about any x-dependence.

We get no (continuous, differentiable) solution. To be a meaningful variational
problem, dependence on $y_x$ or higher derivatives is essential.

Note. The situation will change when constraints are introduced (compare
Exercise 17.7.7).


17.2 APPLICATIONS OF THE EULER
EQUATION

EXAMPLE 17.2.1 Straight Line 

Perhaps the simplest application of the Euler equation is in the determination
of the shortest distance between two points in the xy-plane. Since the element of
distance is

\[ ds = [(dx)^2 + (dy)^2]^{1/2} = [1 + y_x^2]^{1/2}\,dx, \tag{17.19} \]

the distance, J, may be written as

\[ J = \int_{x_1,y_1}^{x_2,y_2}ds = \int_{x_1}^{x_2}[1 + y_x^2]^{1/2}\,dx. \tag{17.20} \]

Comparison with Eq. 17.1 shows that

\[ f(y,y_x,x) = (1 + y_x^2)^{1/2}. \tag{17.21} \]

Substituting into Eq. 17.16, we obtain

\[ -\frac{d}{dx}\left[\frac{1}{(1 + y_x^2)^{1/2}}\right] = 0 \tag{17.22} \]

or

\[ \frac{1}{(1 + y_x^2)^{1/2}} = C,\quad\text{a constant}. \tag{17.23} \]

This is satisfied by

\[ y_x = a,\quad\text{a second constant}, \tag{17.24} \]

and

\[ y = ax + b, \tag{17.25} \]

which is the familiar equation for a straight line. The constants a and b, of course,
are chosen so that the line passes through the two points $(x_1,y_1)$ and $(x_2,y_2)$.
Hence the Euler equation predicts that the shortest⁶ distance between two fixed
points is a straight line.

The generalization of this in curved four-dimensional space-time leads to the
important relativity concept, the geodesic.
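The stationary property can be illustrated numerically by evaluating $J(\alpha)$ of Eq. 17.6 for the integrand of Eq. 17.21 along varied paths $y(x,\alpha) = ax + b + \alpha\,\eta(x)$. The sketch below assumes the particular deformation $\eta(x) = \sin[\pi(x-x_1)/(x_2-x_1)]$, which vanishes at both end points as Eq. 17.2 requires; this $\eta$, the default parameter values, and the Simpson rule are choices made for the example only.

```python
import math


def arc_length(alpha, a=2.0, b=1.0, x1=0.0, x2=1.0, n=2000):
    """J(alpha) for y = a x + b + alpha * sin(pi (x - x1)/(x2 - x1)).

    The intercept b does not affect the length; it is kept only so the
    signature mirrors y = a x + b. n must be even (composite Simpson rule).
    """
    h = (x2 - x1) / n
    width = x2 - x1

    def integrand(x):
        eta_x = math.pi / width * math.cos(math.pi * (x - x1) / width)
        return math.sqrt(1.0 + (a + alpha * eta_x) ** 2)

    total = integrand(x1) + integrand(x2)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(x1 + i * h)
    return total * h / 3.0


def slope_at_zero(step=1e-4):
    """Central-difference estimate of dJ/d(alpha) at alpha = 0."""
    return (arc_length(step) - arc_length(-step)) / (2.0 * step)


if __name__ == "__main__":
    for alpha in (-0.2, -0.1, 0.0, 0.1, 0.2):
        print(alpha, arc_length(alpha))
    print("dJ/dalpha at 0:", slope_at_zero())
```

With these defaults $J(0)$ is the straight-line distance $\sqrt{1+a^2}$, and every nonzero $\alpha$ gives a longer path, consistent with a minimum.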

EXAMPLE 17.2.2 Soap Film 

As a second illustration (Fig. 17.3), consider a surface of revolution generated
by revolving a curve $y(x)$ about the x-axis. The curve is required to pass through
fixed end points $(x_1,y_1)$ and $(x_2,y_2)$. The variational problem is to choose
the curve $y(x)$ so that the area of the resulting surface will be a minimum.

FIG. 17.3 Surface of rotation: soap film problem

For the element of area shown in Fig. 17.3

\[ dA = 2\pi y\,ds = 2\pi y(1 + y_x^2)^{1/2}\,dx. \tag{17.26} \]

The variational equation is then

\[ J = \int_{x_1}^{x_2}2\pi y(1 + y_x^2)^{1/2}\,dx. \tag{17.27} \]

Neglecting the $2\pi$, we obtain

\[ f(y,y_x,x) = y(1 + y_x^2)^{1/2}. \tag{17.28} \]

Since $\partial f/\partial x = 0$, we may apply Eq. 17.18 directly and get

\[ y(1 + y_x^2)^{1/2} - \frac{y\,y_x^2}{(1 + y_x^2)^{1/2}} = c_1 \tag{17.29} \]

or

\[ \frac{y}{(1 + y_x^2)^{1/2}} = c_1. \tag{17.30} \]

Squaring, we get

\[ \frac{y^2}{1 + y_x^2} = c_1^2\quad\text{with } c_1^2\le y_{\min}^2 \tag{17.31} \]

and

\[ (y_x)^{-1} = \frac{dx}{dy} = \frac{c_1}{\sqrt{y^2 - c_1^2}}. \tag{17.32} \]

This may be integrated to give

\[ x = c_1\cosh^{-1}\frac{y}{c_1} + c_2. \tag{17.33} \]

Solving for y, we have

\[ y = c_1\cosh\left(\frac{x - c_2}{c_1}\right), \tag{17.34} \]

and again $c_1$ and $c_2$ are determined by requiring the hyperbolic cosine to pass
through the points $(x_1,y_1)$ and $(x_2,y_2)$. Our “minimum” area surface is a
catenary of revolution or a catenoid.

⁶ Technically, we have a stationary value. From the $\alpha^2$ terms it can be identified
as a minimum (Exercise 17.2.1).
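A quick consistency check on the catenary: along $y = c_1\cosh[(x-c_2)/c_1]$ the first integral of Eq. 17.30, $y/(1+y_x^2)^{1/2}$, should return the constant $c_1$ at every x. The sketch below evaluates it at a few sample points; the function name and the sample values of $c_1$, $c_2$ are arbitrary choices for illustration.

```python
import math


def first_integral(x, c1, c2):
    """Evaluate y / sqrt(1 + y_x^2) on the catenary y = c1 cosh((x - c2)/c1)."""
    u = (x - c2) / c1
    y = c1 * math.cosh(u)
    y_x = math.sinh(u)  # dy/dx of the catenary
    return y / math.sqrt(1.0 + y_x * y_x)


if __name__ == "__main__":
    for x in (-1.0, 0.0, 0.4, 2.0):
        print(x, first_integral(x, c1=0.7, c2=0.3))
```

The constancy is the identity $\cosh^2u - \sinh^2u = 1$ in disguise, which is why the hyperbolic cosine solves Eq. 17.30.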


Soap Film — Minimum Area 

This calculus of variations contains many pitfalls for the unwary. (Remember
the Euler equation is a necessary condition assuming a differentiable solution.
The sufficiency conditions are quite involved. See the references for details.)
Perhaps respect for some of these hazards may be developed by considering
a specific physical problem, for example, the minimum area problem with
$(x_1, y_1) = (-x_0, 1)$, $(x_2, y_2) = (+x_0, 1)$. The minimum surface is a soap film
stretched over the two rings of unit radius at $x = \pm x_0$. The problem is to predict
the curve $y(x)$ assumed by the soap film.

By referring to Eq. 17.34, we find that $c_2 = 0$ by the symmetry of the problem.
Then

$$ y = c_1 \cosh\frac{x}{c_1}. \tag{17.34a} $$

If we take $x_0 = \tfrac12$, we obtain the transcendental equation for $c_1$,

$$ 1 = c_1 \cosh\left(\frac{1}{2c_1}\right). \tag{17.35} $$

We find that this equation has two solutions: $c_1 = 0.2350$, leading to a “deep”
curve, and $c_1 = 0.8483$, leading to a “flat” curve. Which is our minimum? Which
curve is assumed by the soap film? Before answering these questions, consider
the physical situation with the rings moved apart so that $x_0 = 1$. Then Eq.
17.34a becomes

$$ 1 = c_1 \cosh\left(\frac{1}{c_1}\right), \tag{17.36} $$

which has no real solutions! The physical significance is that as the unit radius
rings were moved out from the origin a point was reached at which the soap
film could no longer maintain the same horizontal force over each vertical
section. Stable equilibrium was no longer possible. The soap film broke
(irreversible process) and formed a circular film over each ring (with a total area
of $2\pi = 6.2832\ldots$). This is the Goldschmidt discontinuous solution.
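As a numerical aside, the two roots quoted for Eq. 17.35 and the absence of roots for Eq. 17.36 can be checked with a few lines of code. The following is only a sketch and not part of the original development: it inverts Eq. 17.34a to $x_0 = c_1 \cosh^{-1}(1/c_1)$, scans $0 < c_1 < 1$ for sign changes, and refines each bracket by bisection. The function names, the grid size, and the tolerance are assumptions made for the illustration.

```python
import math

def x0_of_c1(c1):
    """Half-separation x0 at which y = c1*cosh(x/c1) reaches y = 1 (Eq. 17.34a inverted)."""
    return c1 * math.acosh(1.0 / c1)

def bisect(g, lo, hi, tol=1e-12):
    """Plain bisection; assumes g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_lo < 0) == (g_mid < 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def catenoid_constants(x0, n=2000):
    """All c1 in (0, 1) with c1*cosh(x0/c1) = 1, found by scanning then bisecting."""
    g = lambda c1: x0_of_c1(c1) - x0
    grid = [(k + 0.5) / n for k in range(n)]  # interior points of (0, 1)
    roots = []
    for a, b in zip(grid, grid[1:]):
        if (g(a) < 0) != (g(b) < 0):  # sign change brackets a root
            roots.append(bisect(g, a, b))
    return roots

print("x0 = 1/2:", catenoid_constants(0.5))  # deep and flat solutions
print("x0 = 1  :", catenoid_constants(1.0))  # empty list: no real solution
```

Scanning before bisecting is a deliberate choice here: $x_0(c_1)$ rises and then falls on $0 < c_1 < 1$, so a single bracket over the whole interval would hide the fact that there are two roots, or none.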



FIG. 17.4 Solutions of Eq. 17.34a for unit radius rings at $x = \pm x_0$ (horizontal axis: $x_0$)

The next question is: how large may $x_0$ be and still give a real solution
for Eq. 17.34a?⁷ Letting $c_1^{-1} = p$, Eq. 17.34a becomes

$$ p = \cosh px_0. \tag{17.37} $$

To find $x_{0\,\max}$ we could solve for $x_0$ (as in Eq. 17.33) and then differentiate with
respect to $p$. Finally, with an eye on Fig. 17.4, $dx_0/dp$ would be set equal to zero.
Alternatively, direct differentiation of Eq. 17.37 with respect to $p$ yields

$$ 1 = \sinh px_0 \left[ x_0 + p\,\frac{dx_0}{dp} \right]. $$

The requirement that $dx_0/dp$ vanish leads to

$$ 1 = x_0 \sinh px_0. \tag{17.38} $$

Equations 17.37 and 17.38 may be combined to form

$$ px_0 = \coth px_0, \tag{17.39} $$

with the root

$$ px_0 = 1.1997. \tag{17.40} $$

Substituting into Eqs. 17.37 or 17.38, we obtain

$$ p = 1.810, \qquad c_1 = 0.5524, \tag{17.41} $$

and

$$ x_{0\,\max} = 0.6627. \tag{17.42} $$

7 From a numerical point of view it is easier to invert the problem. Pick a
value of $c_1$ and solve for $x_0$. Equation 17.34a becomes $x_0 = c_1 \cosh^{-1}(1/c_1)$.
This has numerical solutions in the range $0 < c_1 \le 1$.
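The numbers in Eqs. 17.40 to 17.42 can be reproduced with a short bisection. The sketch below is illustrative only; the bracket [1, 2], the tolerance, and the function name are assumptions, not something prescribed by the text.

```python
import math

def solve_u_equals_coth_u(lo=1.0, hi=2.0, tol=1e-12):
    """Root of u - coth(u) = 0 by bisection, with u standing for p*x0 in Eq. 17.39."""
    g = lambda u: u - 1.0 / math.tanh(u)  # negative at u = 1, positive at u = 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

u = solve_u_equals_coth_u()   # p*x0, Eq. 17.40
p = math.cosh(u)              # Eq. 17.37
c1 = 1.0 / p                  # since p = 1/c1
x0_max = u / p                # Eq. 17.42

print(f"p*x0 = {u:.4f}, p = {p:.4f}, c1 = {c1:.4f}, x0_max = {x0_max:.4f}")
print("check of Eq. 17.38:", x0_max * math.sinh(u))  # should be close to 1
```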


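Returning to the question of the solution of Eq. 17.35 that describes the
soap film, let us calculate the area corresponding to each solution. We have

$$ \begin{aligned}
A &= 4\pi \int_0^{x_0} y\,(1 + y_x^2)^{1/2}\,dx = \frac{4\pi}{c_1} \int_0^{x_0} y^2\,dx \quad \text{(by Eq. 17.30)} \\
  &= 4\pi c_1 \int_0^{x_0} \cosh^2\left(\frac{x}{c_1}\right) dx \\
  &= \pi c_1^2 \left[ \sinh\left(\frac{2x_0}{c_1}\right) + \frac{2x_0}{c_1} \right].
\end{aligned} \tag{17.43} $$

For $x_0 = \tfrac12$, Eq. 17.35 leads to

$$ c_1 = 0.2350 \rightarrow A = 6.8456, $$

$$ c_1 = 0.8483 \rightarrow A = 5.9917, $$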

showing that the former can at most be only a local minimum. A more detailed
investigation (compare Bliss, Calculus of Variations, Chapter IV) shows that
this surface is not even a local minimum. For $x_0 = \tfrac12$ the soap film will be
described by the flat curve

$$ y = 0.8483 \cosh\left(\frac{x}{0.8483}\right). \tag{17.44} $$

This flat or shallow catenoid (catenary of revolution) will be an absolute
minimum for $0 < x_0 < 0.528$. However, for $0.528 < x_0 < 0.6627$ its area is
greater than that of the Goldschmidt discontinuous solution (6.2832) and it
is only a relative minimum (Fig. 17.5).
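The area comparison can also be carried out numerically. The sketch below evaluates Eq. 17.43 for the two values of $c_1$ quoted in the text and then locates the ring separation at which the flat catenoid has the same area as the two disks. It is an illustration under stated assumptions: the helper names and the bisection bracket (0.56, 0.999) on the flat branch are choices made here, and because the quoted $c_1$ values are rounded to four figures, the deep-curve area is reproduced only to about three figures.

```python
import math

def x0_of_c1(c1):
    """Eq. 17.34a inverted: half-separation for which the catenary reaches y = 1."""
    return c1 * math.acosh(1.0 / c1)

def catenoid_area(c1, x0):
    """Area of the catenoid between the rings, Eq. 17.43."""
    u = 2.0 * x0 / c1
    return math.pi * c1**2 * (math.sinh(u) + u)

def bisect(g, lo, hi, tol=1e-12):
    """Plain bisection; assumes g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_lo < 0) == (g_mid < 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

GOLDSCHMIDT_AREA = 2.0 * math.pi  # two disks of unit radius

area_deep = catenoid_area(0.2350, 0.5)  # c1 rounded as quoted in the text
area_flat = catenoid_area(0.8483, 0.5)

def excess(c1):
    """Flat-branch catenoid area minus the Goldschmidt area, as a function of c1."""
    return catenoid_area(c1, x0_of_c1(c1)) - GOLDSCHMIDT_AREA

c1_equal = bisect(excess, 0.56, 0.999)  # bracket lies on the flat branch
x0_equal = x0_of_c1(c1_equal)

print(f"deep: {area_deep:.3f}, flat: {area_flat:.4f}, two disks: {GOLDSCHMIDT_AREA:.4f}")
print(f"equal areas at x0 = {x0_equal:.4f}")
```

Parametrizing by $c_1$ rather than by $x_0$ avoids solving the transcendental equation inside the bisection loop.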

For an excellent discussion of both the mathematical problems and experi- 
ments with soap films, the reader is referred to Courant and Robbins. 


EXERCISES 

17.2.1 A soap film is stretched across the space between two rings of unit radius
centered at $\pm x_0$ on the x-axis and perpendicular to the x-axis. Using the solution
developed in Section 17.2, set up the transcendental equations for the condition
that $x_0$ is such that the area of the curved surface of rotation equals the area of
the two rings (Goldschmidt discontinuous solution). Solve for $x_0$ (Fig. 17.6).



FIG. 17.5 Catenoid area versus $x_0$ (unit radius rings at $x = \pm x_0$)

FIG. 17.6 Surface of rotation

17.2.2 In Example 17.2.1 expand $J[y(x,\alpha)] - J[y(x,0)]$ in powers of $\alpha$. The term
linear in $\alpha$ leads to the Euler equation and to the straight-line solution, Eq.
17.25. Investigate the $\alpha^2$ term and show that the stationary value of J, the
straight-line distance, is a minimum.

17.2.3 (a) Show that the integral

$$ J = \int_{x_1}^{x_2} f(y, y_x, x)\,dx, \quad \text{with } f = y(x), $$

has no extreme values.

(b) If $f(y, y_x, x) = y^2(x)$, find a discontinuous solution similar to the
Goldschmidt solution for the soap film problem.

17.2.4 Fermat’s principle of optics states that a light ray will follow the path, y(x),
for which

$$ \int_{x_1, y_1}^{x_2, y_2} n(y, x)\,ds $$

is a minimum when n is the index of refraction. For $y_2 = y_1 = 1$, $-x_1 = x_2 = 1$
find the ray path if

(a) $n = e^y$,

(b) $n = a(y - y_0)$, $y > y_0$.

17.2.5 A frictionless particle moves from point A on the surface of the earth to point
B by sliding through a tunnel. Find the differential equation to be satisfied if
the transit time is to be a minimum.

Note. Assume the earth to be a nonrotating sphere of uniform density.

ANS. (Eq. 17.15) $r_{\varphi\varphi}(r^3 - ra^2) + r_\varphi^2(2a^2 - r^2) + a^2 r^2 = 0$,

$r(\varphi = 0) = r_0$, $r_\varphi(\varphi = 0) = 0$, $r(\varphi = \varphi_A) = a$, $r(\varphi = \varphi_B) = a$.

(Eq. 17.18) $$ r_\varphi^2 = \frac{a^2 r^2}{r_0^2} \cdot \frac{r^2 - r_0^2}{a^2 - r^2}. $$

The solution of these equations is a hypocycloid, generated by a circle of radius
$\tfrac12(a - r_0)$ rolling inside the circle of radius a. The student might like to show
that the transit time is

$$ t = \pi\,\frac{(a^2 - r_0^2)^{1/2}}{(ag)^{1/2}}. $$

For details see P. W. Cooper, Am. J. Phys. 34, 68 (1966); G. Venezian et al.,
Am. J. Phys. 34, 701-704 (1966).


17.2.6 A ray of light follows a straight-line path in a first homogeneous medium, is
refracted at an interface, and then follows a new straight-line path in the second
medium. Use Fermat’s principle of optics to derive Snell’s law of refraction:

$$ n_1 \sin\theta_1 = n_2 \sin\theta_2. $$

Hint. Keep the points $(x_1, y_1)$ and $(x_2, y_2)$ fixed and vary $x_0$ to satisfy Fermat
(Fig. 17.7). This is not an Euler equation problem. (The light path is not
differentiable at $x_0$.)


17.2.7 A second soap film configuration for the unit radius rings at $x = \pm x_0$ consists
of a circular disk, radius a, in the x = 0 plane and two catenoids of revolution,
one joining the disk and each ring. One catenoid may be described by

$$ y = c_1 \cosh\left(\frac{x}{c_1} + c_3\right). $$

(a) Impose boundary conditions at x = 0 and at $x = x_0$.

(b) Although not necessary, it is convenient to require that the catenoids
form an angle of 120° where they join the central disk. Express this third
boundary condition in mathematical terms.

(c) Show that the total area of catenoids plus central disk is

$$ A = \pi c_1^2 \left[ \sinh\left(\frac{2x_0}{c_1} + 2c_3\right) + \frac{2x_0}{c_1} \right]. $$

Note. Although this soap film configuration is physically realizable and stable,
the area is larger than that of the simple catenoid for all ring separations for
which both films exist.

ANS. (a) $1 = c_1 \cosh\left(\dfrac{x_0}{c_1} + c_3\right)$, $\quad a = c_1 \cosh c_3$,

(b) $\dfrac{dy}{dx} = \tan 30^\circ = \sinh c_3$.

17.2.8 For the soap film described in Exercise 17.2.7 find (numerically) the maximum
value of $x_0$.

Note. This calls for a hand computer with hyperbolic functions or a table of
hyperbolic cotangents.

ANS. $x_{0\,\max} = 0.4078$.

17.2.9 Find the root of $px_0 = \coth px_0$ (Eq. 17.39) and determine the corresponding
values of p and $x_0$ (Eqs. 17.41 and 17.42). Calculate your values to five significant
figures.

Hint. Try one of the root-determining subroutines listed in Appendix 1.

17.2.10 For the two-ring soap film problem of this section calculate and tabulate $x_0$,
$p$, $p^{-1}$, and $A$, the soap film area, for $px_0 = 0.00(0.02)1.30$.

17.2.11 Find the value of $x_0$ (to five significant figures) that leads to a soap film area,
Eq. 17.43, equal to $2\pi$, the Goldschmidt discontinuous solution.

ANS. $x_0 = 0.52770$.


17.3 GENERALIZATIONS, SEVERAL DEPENDENT VARIABLES

Our original variational problem, Eq. 17.1, may be generalized in
several respects. In this section we consider the integrand, f, to be a function
of several dependent variables, $y_1(x), y_2(x), y_3(x), \ldots$, all of which depend on x,
the independent variable. In Section 17.4 f again will contain only one unknown
function y, but y will be a function of several independent variables (over which
we integrate). In Section 17.5 these two generalizations are combined. Finally,
in Section 17.7 the stationary value is restricted by one or more constraints.

For more than one dependent variable Eq. 17.1 becomes

$$ J = \int_{x_1}^{x_2} f[y_1(x), y_2(x), \ldots, y_n(x), y_{1x}(x), y_{2x}(x), \ldots, y_{nx}(x), x]\,dx. \tag{17.45} $$

As in Section 17.1, we determine the extreme value of J by comparing
neighboring paths. Let

$$ y_i(x, \alpha) = y_i(x, 0) + \alpha\eta_i(x), \quad i = 1, 2, \ldots, n, \tag{17.46} $$

with the $\eta_i$ independent of one another, but subject to the restrictions discussed
in Section 17.1. By differentiating Eq. 17.45 with respect to $\alpha$ and setting $\alpha = 0$,
since Eq. 17.7 still applies, we obtain

$$ \int_{x_1}^{x_2} \sum_i \left( \frac{\partial f}{\partial y_i}\eta_i + \frac{\partial f}{\partial y_{ix}}\eta_{ix} \right) dx = 0, \tag{17.47} $$

the subscript x denoting differentiation with respect to x; that is, $y_{ix} = \partial y_i/\partial x$,
and so on. Again, each of the terms $(\partial f/\partial y_{ix})\eta_{ix}$ is integrated by parts. The
integrated part vanishes and Eq. 17.47 becomes

$$ \int_{x_1}^{x_2} \sum_i \left( \frac{\partial f}{\partial y_i} - \frac{d}{dx}\frac{\partial f}{\partial y_{ix}} \right) \eta_i\,dx = 0. \tag{17.48} $$

Since the $\eta_i$ are arbitrary and independent of one another,¹ each of the terms in
the sum must vanish independently. We have

$$ \frac{\partial f}{\partial y_i} - \frac{d}{dx}\frac{\partial f}{\partial(\partial y_i/\partial x)} = 0, \quad i = 1, 2, \ldots, n, \tag{17.49} $$

a whole set of Euler equations, each of which must be satisfied for an extreme
value.

Hamilton's Principle 

The most important application of Eq. 17.45 occurs when the integrand f is
taken to be the Lagrangian L. The Lagrangian is defined as the difference of
kinetic and potential energies of a system,

$$ L = T - V. \tag{17.50} $$

Using time as an independent variable instead of x and $x_i(t)$ as the dependent
variables,

$$ x \rightarrow t, \qquad y_i \rightarrow x_i(t), \qquad y_{ix} \rightarrow \dot{x}_i(t); $$

$x_i(t)$ is the location and $\dot{x}_i = dx_i/dt$, the velocity of particle i as a function of
time. The equation $\delta J = 0$ is then a mathematical statement of Hamilton’s
principle of classical mechanics,

$$ \delta \int_{t_1}^{t_2} L(x_1, x_2, \ldots, x_n, \dot{x}_1, \dot{x}_2, \ldots, \dot{x}_n; t)\,dt = 0. \tag{17.51} $$

In words, Hamilton’s principle asserts that the motion of the system from time
$t_1$ to $t_2$ is such that the time integral of the Lagrangian L has a stationary value.
The resulting Euler equations are usually called the Lagrangian equations of
motion,

$$ \frac{d}{dt}\frac{\partial L}{\partial \dot{x}_i} - \frac{\partial L}{\partial x_i} = 0. \tag{17.52} $$

These Lagrangian equations can be derived from Newton’s equations of motion
and Newton’s equations can be derived from Lagrange’s. The two sets of
equations are equally “fundamental.”

1 For example, we could set $\eta_2 = \eta_3 = \eta_4 = \cdots = 0$, eliminating all but one term
of the sum, and then treat $\eta_1$ exactly as in Section 17.1.
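Hamilton’s principle, Eq. 17.51, can be made concrete with a small numerical experiment. The sketch below is an illustration only and rests on assumptions not found in the text: a single particle in uniform gravity with $L = \tfrac12 m\dot{y}^2 - mgy$, the parameter values shown, and the trial deviation $\eta(t) = \sin(\pi t/T)$, which vanishes at both end points. For this Lagrangian the action of the varied path $y(t) + \alpha\eta(t)$ should exceed that of the true path by $\alpha^2 m\pi^2/(4T)$, with no term linear in $\alpha$.

```python
import math

# Illustrative values (assumptions): mass, gravity, time span, initial velocity.
M, G, T, V0 = 1.0, 9.81, 1.0, 3.0

def action(alpha, n=4000):
    """Midpoint-rule estimate of the action along y(t) + alpha*eta(t).

    y(t) = V0*t - G*t**2/2 solves the equation of motion; eta(t) = sin(pi*t/T)
    is an arbitrary deviation that vanishes at t = 0 and t = T.
    """
    dt = T / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        y = V0 * t - 0.5 * G * t * t + alpha * math.sin(math.pi * t / T)
        y_dot = V0 - G * t + alpha * (math.pi / T) * math.cos(math.pi * t / T)
        total += (0.5 * M * y_dot**2 - M * G * y) * dt  # L = T - V
    return total

s0 = action(0.0)
for alpha in (-0.2, -0.1, 0.1, 0.2):
    predicted = alpha**2 * M * math.pi**2 / (4.0 * T)
    print(f"alpha = {alpha:+.1f}: S - S0 = {action(alpha) - s0:.6f}, predicted = {predicted:.6f}")
```

The change in the action is even in $\alpha$, which is the numerical signature of a stationary value at $\alpha = 0$.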

The Lagrangian formulation has certain valuable advantages over the
conventional Newtonian laws. Whereas Newton’s equations are vector equations,
we see that Lagrange’s equations involve only scalar quantities. The
coordinates $x_1, x_2, \ldots$ need not be any standard set of coordinates or lengths.
They can be selected to match the conditions of the physical problem. The
Lagrange equations are invariant with respect to the choice of coordinate
system. Newton’s equations (in component form) are not invariant. Exercise
2.5.10 shows what happens to F = ma resolved in spherical polar coordinates.

Exploiting the concept of energy, we may easily extend the Lagrangian
formulation from mechanics to diverse fields such as electrical networks and
acoustical systems. Extensions to electromagnetism appear in the exercises. The
result is a unity of otherwise separate areas of physics. In the development of
new areas the quantization of Lagrangian particle mechanics provided a model
for the quantization of electromagnetic fields and led to the modern theory of
quantum electrodynamics.

One of the most valuable advantages of the formulation in terms of Hamilton’s
principle and the Lagrange equations is the ease in seeing a relation between a
symmetry and a conservation law. As an example, let $x_i = \varphi$, an azimuthal angle.
If our Lagrangian is independent of $\varphi$ (i.e., $\varphi$ is an ignorable coordinate), there are
two consequences: (1) an axial (rotational) symmetry and (2) from Eq. 17.52,
$\partial L/\partial\dot{\varphi}$ = constant. Physically, this corresponds to the conservation or
invariance of a component of angular momentum. Similarly, invariance under
translation leads to conservation of linear momentum. Noether’s theorem is a
generalization of this relation between invariance (symmetry) and conservation law.


EXAMPLE 17.3.1. Moving Particle — Cartesian Coordinates 


Consider Eq. 17.50, which describes one particle with kinetic energy

$$ T = \tfrac12 m\dot{x}^2 \tag{17.53} $$

and potential energy V(x), in which, as usual, the force is given by the negative
gradient of the potential,

$$ F(x) = -\frac{dV(x)}{dx}. \tag{17.54} $$

From Eq. 17.52

$$ \frac{d}{dt}(m\dot{x}) - \frac{\partial(T - V)}{\partial x} = m\ddot{x} - F(x) = 0, \tag{17.55} $$

which is simply Newton’s second law of motion.


EXAMPLE 17.3.2. Moving Particle — Circular Cylindrical Coordinates 


Now let us describe a particle moving in the xy-plane (z = 0), using circular
cylindrical coordinates. The kinetic energy is

$$ T = \tfrac12 m(\dot{x}^2 + \dot{y}^2) = \tfrac12 m(\dot{\rho}^2 + \rho^2\dot{\varphi}^2), \tag{17.56} $$

and we take V = 0.

The transformation of $\dot{x}^2 + \dot{y}^2$ into circular cylindrical coordinates could be
carried out by taking $x(\rho, \varphi)$ and $y(\rho, \varphi)$, Eq. 2.28, and differentiating with
respect to time and squaring. It is much easier to interpret $\dot{x}^2 + \dot{y}^2$ as $v^2$ and
just write down the components of v as $\boldsymbol{\rho}_0\,(ds_\rho/dt) = \boldsymbol{\rho}_0\,\dot{\rho}$, and so on. (The $ds_\rho$
is an increment of length, $\rho$ changing by $d\rho$, $\varphi$ remaining constant. See Sections
2.1 and 2.4.)

The Lagrangian equations yield

$$ \frac{d}{dt}(m\dot{\rho}) - m\rho\dot{\varphi}^2 = 0, \qquad \frac{d}{dt}(m\rho^2\dot{\varphi}) = 0. \tag{17.57} $$

The second equation is a simple statement of conservation of angular momentum.
The first may be interpreted as radial acceleration² equated to centrifugal
force. In this sense the centrifugal force is a real force. It is of some interest
that this interpretation of centrifugal force as a real force is supported by the
general theory of relativity.
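As a check on Eqs. 17.57, the two equations can be integrated numerically for a free particle (V = 0). The sketch below is an illustration, not a method taken from the text: it uses a classical fourth-order Runge-Kutta step, unit mass, and the starting values $\rho = 1$, $\varphi = 0$, $\dot{\rho} = 0$, $\dot{\varphi} = 1$, all of which are assumptions. In Cartesian terms this particle starts at (1, 0) with velocity (0, 1), so after unit time it should be at (1, 1), and $\rho^2\dot{\varphi}$ should stay equal to 1.

```python
import math

def rhs(state):
    """Time derivatives from Eqs. 17.57 with m = 1 and V = 0."""
    rho, phi, rho_dot, phi_dot = state
    rho_ddot = rho * phi_dot**2                 # d/dt(rho_dot) = rho * phi_dot^2
    phi_ddot = -2.0 * rho_dot * phi_dot / rho   # from d/dt(rho^2 * phi_dot) = 0
    return (rho_dot, phi_dot, rho_ddot, phi_ddot)

def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, 0.5 * dt))
    k3 = rhs(shifted(state, k2, 0.5 * dt))
    k4 = rhs(shifted(state, k3, dt))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(state, t_end, n_steps):
    dt = t_end / n_steps
    for _ in range(n_steps):
        state = rk4_step(state, dt)
    return state

start = (1.0, 0.0, 0.0, 1.0)  # rho, phi, rho_dot, phi_dot
rho, phi, rho_dot, phi_dot = integrate(start, 1.0, 1000)

angular_momentum = rho**2 * phi_dot            # per unit mass
x, y = rho * math.cos(phi), rho * math.sin(phi)
print(f"rho^2 * phi_dot = {angular_momentum:.9f}")
print(f"(x, y) = ({x:.9f}, {y:.9f})")
```

The straight-line Cartesian path emerges even though neither polar equation looks like uniform motion, which illustrates the coordinate independence of the Lagrange equations.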


2 Here is a second method of attacking Exercise 2.4.8.

EXERCISES

17.3.1 (a) Develop the equations of motion corresponding to $L = \tfrac12 m(\dot{x}^2 + \dot{y}^2)$.

(b) In what sense do your solutions minimize the integral $\int_{t_1}^{t_2} L\,dt$?
Compare the result for your solution with x = const., y = const.

17.3.2 From the Lagrangian equations of motion, Eq. 17.52, show that a system in
stable equilibrium has a minimum potential energy.

17.3.3 Write out the Lagrangian equations of motion of a particle in spherical
coordinates for potential V equal to a constant. Identify the terms corresponding to
(a) centrifugal force and (b) Coriolis force.

17.3.4 The spherical pendulum consists of a mass on a wire of length l, free to move
in polar angle $\theta$ and azimuth angle $\varphi$ (Fig. 17.8).

(a) Set up the Lagrangian for this physical system.

(b) Develop the Lagrangian equations of motion.

17.3.5 Show that the Lagrangian

$$ L = m_0 c^2 \left( 1 - \sqrt{1 - \frac{v^2}{c^2}} \right) - V(\mathbf{r}) $$

leads to a relativistic form of Newton’s second law of motion,

$$ \frac{d}{dt}\left( \frac{m_0 v_i}{\sqrt{1 - v^2/c^2}} \right) = F_i, $$

in which $F_i = -\partial V/\partial x_i$.

17.3.6 The Lagrangian for a particle with charge q in an electromagnetic field
described by scalar potential $\varphi$ and vector potential A is

$$ L = \tfrac12 mv^2 - q\varphi + q\mathbf{A}\cdot\mathbf{v}. $$

Find the equation of motion of the charged particle.

Hint. $\dfrac{d}{dt}A_j = \dfrac{\partial A_j}{\partial t} + \sum_i \dfrac{\partial A_j}{\partial x_i}\dot{x}_i$. The dependence of the force fields E and B upon
the potentials $\varphi$ and A is developed in Section 1.13 (compare Exercise 1.13.10).

ANS. $m\ddot{x}_i = q[\mathbf{E} + \mathbf{v} \times \mathbf{B}]_i$.

17.3.7 Consider a system in which the Lagrangian is given by

$$ L(q_i, \dot{q}_i) = T(q_i, \dot{q}_i) - V(q_i), $$

where $q_i$ and $\dot{q}_i$ represent sets of variables. The potential energy V is independent
of velocity and neither T nor V has any explicit time dependence.

(a) Show that

$$ \frac{d}{dt}\left( \sum_j \dot{q}_j \frac{\partial L}{\partial \dot{q}_j} - L \right) = 0. $$

(b) The constant quantity

$$ \sum_j \dot{q}_j \frac{\partial L}{\partial \dot{q}_j} - L $$

defines the Hamiltonian H. Show that under the preceding assumed conditions,
H = T + V, the total energy.

Note. The kinetic energy T is a quadratic function of the $\dot{q}_i$’s.
Note. The kinetic energy T is a quadratic function of the q- s. 


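17.4 SEVERAL INDEPENDENT VARIABLES

Sometimes the integrand f of Eq. 17.1 will contain one unknown function u
which is a function of several independent variables, u = u(x, y, z), for the
three-dimensional case. Equation 17.1 becomes

$$ J = \iiint f[u, u_x, u_y, u_z, x, y, z]\,dx\,dy\,dz, \tag{17.58} $$

$u_x$ indicating $\partial u/\partial x$, and so on. The variational problem is to find the function
u(x, y, z) for which J is stationary,

$$ \delta J = \alpha \left.\frac{\partial J}{\partial \alpha}\right|_{\alpha=0} = 0. \tag{17.59} $$

Generalizing Section 17.1, we let

$$ u(x, y, z, \alpha) = u(x, y, z, 0) + \alpha\eta(x, y, z). \tag{17.60} $$

$u(x, y, z, \alpha = 0)$ represents the (unknown) function for which Eq. 17.59 is satisfied,
whereas again $\eta(x, y, z)$ is the arbitrary deviation that describes the varied
function $u(x, y, z, \alpha)$. This deviation, $\eta(x, y, z)$, is required to be differentiable and
to vanish at the end points. Then from Eq. 17.60,

$$ u_x(x, y, z, \alpha) = u_x(x, y, z, 0) + \alpha\eta_x, \tag{17.61} $$

and similarly for $u_y$ and $u_z$.

Differentiating the integral (Eq. 17.58) with respect to the parameter $\alpha$ and
then setting $\alpha = 0$, we obtain

$$ \left.\frac{\partial J}{\partial \alpha}\right|_{\alpha=0} = \iiint \left( \frac{\partial f}{\partial u}\eta + \frac{\partial f}{\partial u_x}\eta_x + \frac{\partial f}{\partial u_y}\eta_y + \frac{\partial f}{\partial u_z}\eta_z \right) dx\,dy\,dz = 0. \tag{17.62} $$

Again, we integrate each of the terms $(\partial f/\partial u_i)\eta_i$ by parts. The integrated part
vanishes at the end points (because the deviation $\eta$ is required to go to zero at
the end points) and¹

$$ \iiint \left( \frac{\partial f}{\partial u} - \frac{\partial}{\partial x}\frac{\partial f}{\partial u_x} - \frac{\partial}{\partial y}\frac{\partial f}{\partial u_y} - \frac{\partial}{\partial z}\frac{\partial f}{\partial u_z} \right) \eta(x, y, z)\,dx\,dy\,dz = 0. \tag{17.63} $$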

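Since the variation $\eta(x, y, z)$ is arbitrary, the term in large parentheses may be set
equal to zero. This yields the Euler equation for (three) independent variables,

$$ \frac{\partial f}{\partial u} - \frac{\partial}{\partial x}\frac{\partial f}{\partial u_x} - \frac{\partial}{\partial y}\frac{\partial f}{\partial u_y} - \frac{\partial}{\partial z}\frac{\partial f}{\partial u_z} = 0. \tag{17.64} $$

1 Again, it is imperative that the precise meaning of partial derivatives be
understood fully. Specifically, in Eq. 17.63 $\partial/\partial x$ is a partial derivative, in that
y and z are constant. But $\partial/\partial x$ is also a total derivative in that it acts on implicit
x-dependence as well as on explicit x-dependence. In this sense

$$ \frac{\partial}{\partial x}\left( \frac{\partial f}{\partial u_x} \right) = \frac{\partial^2 f}{\partial x\,\partial u_x} + \frac{\partial^2 f}{\partial u\,\partial u_x}u_x + \frac{\partial^2 f}{\partial u_x^2}u_{xx} + \frac{\partial^2 f}{\partial u_y\,\partial u_x}u_{xy} + \frac{\partial^2 f}{\partial u_z\,\partial u_x}u_{xz}. $$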


EXAMPLE 17.4.1 Laplace’s Equation 


An example of this sort of variational problem is provided by electrostatics.
The energy of an electrostatic field is

$$ \text{energy density} = \tfrac12 \varepsilon E^2, \tag{17.65} $$

in which E is the usual electrostatic force field. In terms of the static potential $\varphi$,

$$ \text{energy density} = \tfrac12 \varepsilon (\nabla\varphi)^2. \tag{17.66} $$

Now let us impose the requirement that the electrostatic energy (associated
with the field) in a given volume be a minimum. (Boundary conditions on E
and $\varphi$ must still be satisfied.) We have the volume integral²

$$ J = \iiint (\nabla\varphi)^2\,dx\,dy\,dz = \iiint (\varphi_x^2 + \varphi_y^2 + \varphi_z^2)\,dx\,dy\,dz. \tag{17.67} $$

With

$$ f(\varphi, \varphi_x, \varphi_y, \varphi_z, x, y, z) = \varphi_x^2 + \varphi_y^2 + \varphi_z^2, \tag{17.68} $$

the function $\varphi$ replacing the u of Eq. 17.64, Euler’s equation (Eq. 17.64) yields

$$ -2(\varphi_{xx} + \varphi_{yy} + \varphi_{zz}) = 0 \tag{17.69} $$

or

$$ \nabla^2\varphi(x, y, z) = 0, \tag{17.70} $$

which is just Laplace’s equation of electrostatics.

Closer investigation shows that this stationary value is indeed a minimum.
Thus the demand that the field energy be minimized leads to Laplace’s equation.
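The link between minimum field energy and Laplace’s equation can be seen in a discrete setting. The sketch below is a two-dimensional illustration under stated assumptions: an 11 by 11 grid, boundary values fixed to $\varphi = x$, an interior that starts at zero, and Gauss-Seidel relaxation (a technique brought in for the example, not one used in the text). Replacing a grid value by the average of its four neighbors is the discrete Laplace equation, and it is also the value that minimizes the sum of squared differences to those neighbors, so the discrete energy should never increase from sweep to sweep.

```python
def dirichlet_energy(phi):
    """Sum of squared differences over all grid edges (discrete analog of Eq. 17.67)."""
    n = len(phi)
    total = 0.0
    for i in range(n):
        for j in range(n):
            if i + 1 < n:
                total += (phi[i + 1][j] - phi[i][j]) ** 2
            if j + 1 < n:
                total += (phi[i][j + 1] - phi[i][j]) ** 2
    return total

def relax(phi, sweeps):
    """Gauss-Seidel sweeps over interior points; returns the energy after each sweep."""
    n = len(phi)
    energies = [dirichlet_energy(phi)]
    for _ in range(sweeps):
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # Discrete Laplace equation: value equals the mean of its neighbors.
                phi[i][j] = 0.25 * (phi[i + 1][j] + phi[i - 1][j]
                                    + phi[i][j + 1] + phi[i][j - 1])
        energies.append(dirichlet_energy(phi))
    return energies

n = 11
# Boundary values phi = x (x = i/(n-1)); interior starts at zero.
phi = [[i / (n - 1) if (i in (0, n - 1) or j in (0, n - 1)) else 0.0
        for j in range(n)] for i in range(n)]

energies = relax(phi, 400)
max_error = max(abs(phi[i][j] - i / (n - 1)) for i in range(n) for j in range(n))
print(f"energy: {energies[0]:.4f} -> {energies[-1]:.4f}, max deviation from phi = x: {max_error:.2e}")
```

The relaxed field approaches $\varphi = x$, which satisfies Laplace’s equation with these boundary values.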


2 Remember that the subscript x indicates the x-partial derivative, not an
x-component.

EXERCISES

17.4.1 The Lagrangian for a vibrating string (small amplitude vibrations) is

$$ L = \int \left( \tfrac12 \rho u_t^2 - \tfrac12 \tau u_x^2 \right) dx, $$

where $\rho$ is the (constant) linear mass density and $\tau$ is the (constant) tension. The
x-integration is over the length of the string. Show that application of Hamilton’s
principle to the Lagrangian density (the integrand), now with two independent
variables, leads to the classical wave equation

$$ \frac{\partial^2 u}{\partial x^2} = \frac{\rho}{\tau}\frac{\partial^2 u}{\partial t^2}. $$

17.4.2 Show that the stationary value of the total energy of the electrostatic field of
Example 17.4.1 is a minimum.

Hint. Use Eq. 17.61 and investigate the $\alpha^2$ terms.


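17.5 MORE THAN ONE DEPENDENT, MORE THAN ONE INDEPENDENT VARIABLE

In some cases our integrand f contains more than one dependent variable
and more than one independent variable. Consider

$$ f = f[p(x, y, z), p_x, p_y, p_z, q(x, y, z), q_x, q_y, q_z, r(x, y, z), r_x, r_y, r_z, x, y, z]. \tag{17.71} $$

We proceed as before with

$$ \begin{aligned}
p(x, y, z, \alpha) &= p(x, y, z, 0) + \alpha\xi(x, y, z), \\
q(x, y, z, \alpha) &= q(x, y, z, 0) + \alpha\eta(x, y, z), \\
r(x, y, z, \alpha) &= r(x, y, z, 0) + \alpha\zeta(x, y, z), \quad \text{and so on.}
\end{aligned} \tag{17.72} $$

Keeping in mind that $\xi$, $\eta$, and $\zeta$ are independent of one another, as were the
$\eta_i$ in Section 17.3, the same differentiation and then integration by parts leads to

$$ \frac{\partial f}{\partial p} - \frac{\partial}{\partial x}\frac{\partial f}{\partial p_x} - \frac{\partial}{\partial y}\frac{\partial f}{\partial p_y} - \frac{\partial}{\partial z}\frac{\partial f}{\partial p_z} = 0, \tag{17.73} $$

with similar equations for functions q and r. Replacing p, q, r, ... with $y_i$ and
x, y, z, ... with $x_j$, we can put Eq. 17.73 in a more compact form:

$$ \frac{\partial f}{\partial y_i} - \sum_j \frac{\partial}{\partial x_j}\left( \frac{\partial f}{\partial y_{ij}} \right) = 0, \quad i = 1, 2, \ldots, \tag{17.73a} $$

in which

$$ y_{ij} \equiv \frac{\partial y_i}{\partial x_j}. $$

An application of Eq. 17.73 appears in Section 17.7.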


Relation to Physics 

The calculus of variations as developed so far provides a convenient and
perhaps elegant description of a wide variety of physical phenomena. The
physics includes ordinary mechanics, Section 17.3; relativistic mechanics,
Exercise 17.3.5; electrostatics, Example 17.4.1; and electromagnetic theory in
Exercise 17.5.1. The convenience and elegance should not be minimized, but at
the same time the student should be aware that in these cases the calculus of
variations has only provided an alternate description of what was already
known. It has not provided any new physics.

The situation does change with the challenging and incomplete theories
of modern particle and field physics. Here the basic physics is not yet known
and a postulated variational principle can be a useful starting point.


EXERCISE 


17.5.1 The Lagrangian (per unit volume) of an electromagnetic field with a charge
density $\rho$ is given by

$$ \mathcal{L} = \tfrac12\left( \varepsilon_0 E^2 - \frac{1}{\mu_0}B^2 \right) - \rho\varphi + \rho\,\mathbf{v}\cdot\mathbf{A}. $$

Show that Lagrange’s equations lead to two of Maxwell’s equations. (The
remaining two are a consequence of the definition of E and B in terms of A and $\varphi$.)
This Lagrangian density comes from a scalar expression in Section 3.7.

Hint. Take $A_1$, $A_2$, $A_3$, and $\varphi$ as dependent variables, x, y, z, and t as independent
variables. E and B are given in terms of A and $\varphi$ by Eq. 3.104.


17.6 LAGRANGIAN MULTIPLIERS

In this section the concept of a constraint is introduced. To simplify the
treatment, the constraint appears as a simple function rather than as an integral.
In this section we are not concerned with the calculus of variations, but in
Section 17.7 the constraints, with our newly developed Lagrangian multipliers,
are incorporated into the calculus of variations.

Consider a function of three independent variables, f(x, y, z). For the function
f to be a maximum (or extreme)¹

$$ df = 0. \tag{17.74} $$

The necessary and sufficient condition for this is

$$ \frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = \frac{\partial f}{\partial z} = 0, \tag{17.75} $$

in which

$$ df = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy + \frac{\partial f}{\partial z}dz. \tag{17.76} $$

Often in physical problems the variables x, y, z are subjected to constraints
so that they are no longer all independent. It is possible, at least in principle,
to use each constraint to eliminate one variable and to proceed with a new and
smaller set of independent variables.


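1 Including a four-dimensional saddle point.

The use of Lagrangian multipliers is an alternate technique that may be
applied when this elimination of variables is inconvenient or undesirable. Let
our equation of constraint be

$$ \varphi(x, y, z) = 0, \tag{17.77} $$

from which

$$ d\varphi = \frac{\partial\varphi}{\partial x}dx + \frac{\partial\varphi}{\partial y}dy + \frac{\partial\varphi}{\partial z}dz = 0. \tag{17.78} $$

Returning to Eq. 17.74, we see that Eq. 17.75 no longer follows because there
are now only two independent variables. If we take x and y as these independent
variables, dz is no longer arbitrary. However, we may add Eq. 17.76 and a
multiple of Eq. 17.78 to obtain

$$ df + \lambda\,d\varphi = \left( \frac{\partial f}{\partial x} + \lambda\frac{\partial\varphi}{\partial x} \right)dx + \left( \frac{\partial f}{\partial y} + \lambda\frac{\partial\varphi}{\partial y} \right)dy + \left( \frac{\partial f}{\partial z} + \lambda\frac{\partial\varphi}{\partial z} \right)dz = 0. \tag{17.79} $$

Our Lagrangian multiplier $\lambda$ is chosen so that

$$ \frac{\partial f}{\partial z} + \lambda\frac{\partial\varphi}{\partial z} = 0, \tag{17.80} $$

assuming that $\partial\varphi/\partial z \ne 0$. Equation 17.79 now becomes

$$ \left( \frac{\partial f}{\partial x} + \lambda\frac{\partial\varphi}{\partial x} \right)dx + \left( \frac{\partial f}{\partial y} + \lambda\frac{\partial\varphi}{\partial y} \right)dy = 0. \tag{17.81} $$

However, we took dx and dy to be arbitrary and the quantities in parentheses
must vanish,

$$ \frac{\partial f}{\partial x} + \lambda\frac{\partial\varphi}{\partial x} = 0, \qquad \frac{\partial f}{\partial y} + \lambda\frac{\partial\varphi}{\partial y} = 0. \tag{17.82} $$

When Eqs. 17.80 and 17.82 are satisfied, df = 0 and f is an extremum. Notice
that there are now four unknowns: x, y, z, and $\lambda$. The fourth equation is, of course,
the constraint (17.77). Actually we want only x, y, and z; $\lambda$ need not be determined.
For this reason $\lambda$ is sometimes called Lagrange’s undetermined multiplier.
This method will fail if all the coefficients of $\lambda$ vanish at the extremum,
$\partial\varphi/\partial x$, $\partial\varphi/\partial y$, $\partial\varphi/\partial z = 0$. It is then impossible to solve for $\lambda$.

The reader might note that from the form of Eqs. 17.80 and 17.82, we could
identify f as the function taking an extreme value subject to $\varphi$, the constraint,
or identify f as the constraint and $\varphi$ as the function.

If we have a set of constraints $\varphi_k$, then Eqs. 17.80 and 17.82 become

$$ \frac{\partial f}{\partial x_i} + \sum_k \lambda_k \frac{\partial\varphi_k}{\partial x_i} = 0, \quad i = 1, 2, 3, $$

in which the $x_i$ stand for x, y, and z, with a separate Lagrange multiplier $\lambda_k$ for each $\varphi_k$.

EXAMPLE 17.6.1 Particle in a Box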


As an example of the use of Lagrangian multipliers, consider the quantum
mechanical problem of a particle (mass m) in a box. The box is a rectangular
parallelepiped with sides a, b, and c. The ground state energy of the particle is
given by

$$ E = \frac{h^2}{8m}\left( \frac{1}{a^2} + \frac{1}{b^2} + \frac{1}{c^2} \right). \tag{17.83} $$

We seek the shape of the box that will minimize the energy E, subject to the
constraint that the volume is constant,

$$ V(a, b, c) = abc = k. \tag{17.84} $$

With $f(a, b, c) = E(a, b, c)$ and $\varphi(a, b, c) = abc - k = 0$, we obtain

$$ \frac{\partial E}{\partial a} + \lambda\frac{\partial\varphi}{\partial a} = -\frac{h^2}{4ma^3} + \lambda bc = 0. \tag{17.85} $$

Also,

$$ -\frac{h^2}{4mb^3} + \lambda ac = 0, \qquad -\frac{h^2}{4mc^3} + \lambda ab = 0. $$

Multiplying the first of these expressions by a, the second by b, and the third
by c, we have

$$ \lambda abc = \frac{h^2}{4ma^2} = \frac{h^2}{4mb^2} = \frac{h^2}{4mc^2}. \tag{17.86} $$

Therefore our solution is

$$ a = b = c, \quad \text{a cube.} \tag{17.87} $$

Notice that $\lambda$ has not been determined. It remains an undetermined multiplier.

EXAMPLE 17.6.2 Cylindrical Nuclear Reactor 


A further example is provided by nuclear reactor theory. Suppose a
(thermal) nuclear reactor is to have the shape of a right circular cylinder of
radius R and height H. Neutron diffusion theory supplies a constraint:²

$$ \varphi(R, H) = \left( \frac{2.4048}{R} \right)^2 + \left( \frac{\pi}{H} \right)^2 = \text{constant}. \tag{17.88} $$

We wish to minimize the volume of the reactor

$$ f(R, H) = \pi R^2 H. $$

Application of Eq. 17.82 leads to

$$ \frac{\partial f}{\partial R} + \lambda\frac{\partial\varphi}{\partial R} = 2\pi RH - 2\lambda\frac{(2.4048)^2}{R^3} = 0, \tag{17.89} $$

$$ \frac{\partial f}{\partial H} + \lambda\frac{\partial\varphi}{\partial H} = \pi R^2 - 2\lambda\frac{\pi^2}{H^3} = 0. \tag{17.90} $$

By multiplying the first of these equations by R/2 and the second by H, we obtain

$$ \pi R^2 H = \lambda\frac{(2.4048)^2}{R^2} = 2\lambda\frac{\pi^2}{H^2}, \tag{17.91} $$

or

$$ H = \frac{\sqrt{2}\,\pi}{2.4048}R = 1.847R \tag{17.92} $$

for the minimum volume right-circular cylindrical reactor.

Strictly speaking, we have found only an extremum. Its identification as a
minimum follows from a consideration of the original equations.

2 2.4048 ... is the lowest root of the Bessel function $J_0(R)$ (compare Section 11.1).
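The result of Eq. 17.92 can be cross-checked by the alternative route mentioned at the start of this section: use the constraint to eliminate one variable and minimize directly. The sketch below is an illustration under assumptions made here: the constant in Eq. 17.88 is set to 1 (the ratio H/R does not depend on it), the minimization is a simple ternary search over an assumed bracket, and all function names are hypothetical. It also evaluates $\lambda$ from each of Eqs. 17.89 and 17.90 to confirm that a single multiplier satisfies both at the minimum.

```python
import math

A0 = 2.4048  # lowest root of J0, as quoted in the text

def height_from_constraint(radius, b=1.0):
    """Solve (A0/R)^2 + (pi/H)^2 = b^2 (Eq. 17.88) for H; requires R > A0/b."""
    return math.pi / math.sqrt(b * b - (A0 / radius) ** 2)

def volume(radius, b=1.0):
    """Reactor volume pi*R^2*H with H eliminated through the constraint."""
    return math.pi * radius**2 * height_from_constraint(radius, b)

def minimize_unimodal(g, lo, hi, iters=200):
    """Ternary search for the minimum of a function with a single interior minimum."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) < g(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

r_opt = minimize_unimodal(volume, 1.0001 * A0, 10.0 * A0)
h_opt = height_from_constraint(r_opt)
ratio = h_opt / r_opt

lam_from_R = math.pi * r_opt**4 * h_opt / A0**2      # lambda solved from Eq. 17.89
lam_from_H = r_opt**2 * h_opt**3 / (2.0 * math.pi)   # lambda solved from Eq. 17.90

print(f"H/R = {ratio:.4f}")
print(f"lambda from Eq. 17.89: {lam_from_R:.4f}, from Eq. 17.90: {lam_from_H:.4f}")
```

Because the volume grows without bound at both ends of the bracket, the search also supports the statement that the extremum is a minimum.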


EXERCISES 


The following problems are to be solved by using Lagrangian multipliers. 

17.6.1 The ground state energy of a particle in a pillbox (right-circular cylinder) is
given by

$$ E = \frac{\hbar^2}{2m}\left( \frac{(2.4048)^2}{R^2} + \frac{\pi^2}{H^2} \right), $$

in which R is the radius and H, the height of the pillbox. Find the ratio of R
to H that will minimize the energy for a fixed volume.

17.6.2 Find the ratio of R (radius) to H (height) that will minimize the total surface
area of a right-circular cylinder of fixed volume.

17.6.3 The U.S. Post Office limits first class mail to Canada to a total of 36 inches,
length plus girth. Using a Lagrange multiplier, find the maximum volume and
the dimensions of a (rectangular parallelepiped) package subject to this
constraint.

17.6.4 A thermal nuclear reactor is subject to the constraint

$$ \varphi(a, b, c) = \left( \frac{\pi}{a} \right)^2 + \left( \frac{\pi}{b} \right)^2 + \left( \frac{\pi}{c} \right)^2 = B^2, \quad \text{a constant.} $$

Find the ratios of the sides of the rectangular parallelepiped reactor of minimum
volume.

ANS. a = b = c, cube.





17.6.5 For a simple lens of focal length f the object distance p and the image distance
q are related by $1/p + 1/q = 1/f$.
Find the minimum object-image distance (p + q) for fixed f. Assume real object
and image (p and q both positive).

17.6.6 You have an ellipse $(x/a)^2 + (y/b)^2 = 1$. Find the inscribed rectangle of maximum
area. Show that the ratio of the area of the maximum area rectangle to the area
of the ellipse is $2/\pi = 0.6366$.

17.6.7 A rectangular parallelepiped is inscribed in an ellipsoid of semiaxes a, b, and c.
Maximize the volume of the inscribed rectangular parallelepiped. Show that
the ratio of the maximum volume to the volume of the ellipsoid is $2/(\pi\sqrt{3}) \approx 0.367$.

17.6.8 A deformed sphere has a radius given by $r = r_0\{\alpha_0 + \alpha_2 P_2(\cos\theta)\}$, where $\alpha_0 \approx 1$
and $\alpha_2 \approx 0$. From Exercise 12.5.14 the area and volume are

$$ A = 4\pi r_0^2 \alpha_0^2 \left\{ 1 + \frac{4}{5}\left( \frac{\alpha_2}{\alpha_0} \right)^2 \right\}, \qquad V = \frac{4\pi r_0^3}{3}\alpha_0^3 \left\{ 1 + \frac{3}{5}\left( \frac{\alpha_2}{\alpha_0} \right)^2 \right\}. $$

Terms of order $\alpha_2^3$ have been neglected.

(a) With the constraint that the enclosed volume be held constant, that is,
$V = 4\pi r_0^3/3$, show that the bounding surface of minimum area is a sphere
($\alpha_0 = 1$, $\alpha_2 = 0$).

(b) With the constraint that the area of the bounding surface be held constant,
that is, $A = 4\pi r_0^2$, show that the enclosed volume is a maximum when the
surface is a sphere.

17.6.9 Find the maximum value of the directional derivative of $\varphi(x, y, z)$,

$$ \frac{d\varphi}{ds} = \frac{\partial\varphi}{\partial x}\cos\alpha + \frac{\partial\varphi}{\partial y}\cos\beta + \frac{\partial\varphi}{\partial z}\cos\gamma, $$

subject to the constraint

$$ \cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1. $$

ANS. $\left( \dfrac{d\varphi}{ds} \right) = |\nabla\varphi|$.

Note concerning the following exercises:

In a quantum-mechanical system there are $g_i$ distinct quantum states between
energy $E_i$ and $E_i + dE_i$. The problem is to describe how $n_i$ particles are
distributed among these states subject to two constraints:

(a) Fixed number of particles:

$$ \sum_i n_i = n. $$

(b) Fixed total energy:

$$ \sum_i n_i E_i = E. $$

17.6.10 For identical particles obeying the Pauli exclusion principle the probability of
a given arrangement is

$$ W_{FD} = \prod_i \frac{g_i!}{n_i!\,(g_i - n_i)!}. $$

Show that maximizing $W_{FD}$ subject to a fixed number of particles and fixed
total energy leads to

$$ n_i = \frac{g_i}{e^{\lambda_1 + \lambda_2 E_i} + 1}. $$

With $\lambda_1 = -E_0/kT$ and $\lambda_2 = 1/kT$, this yields Fermi-Dirac statistics.

Hint. Try working with ln W and using Stirling’s formula, Section 10.3. The
justification for differentiation with respect to $n_i$ is that we are dealing here with
a large number of particles, $\Delta n_i/n_i \ll 1$.

17.6.11 For identical particles but no restriction on the number in a given state the
probability of a given arrangement is

$$ W_{BE} = \prod_i \frac{(n_i + g_i - 1)!}{n_i!\,(g_i - 1)!}. $$

Show that maximizing $W_{BE}$, subject to a fixed number of particles and fixed
total energy, leads to

$$ n_i = \frac{g_i}{e^{\lambda_1 + \lambda_2 E_i} - 1}. $$

With $\lambda_2 = 1/kT$, this yields Bose-Einstein statistics.
Note. Assume that $g_i \gg 1$.

17.6.12 Photons satisfy $W_{BE}$ and the constraint that total energy is constant. They
clearly do not satisfy the fixed number constraint. Show that eliminating the
fixed number constraint leads to the foregoing result but with $\lambda_1 = 0$.


17.7 VARIATION SUBJECT TO CONSTRAINTS 

As in the preceding sections, we seek the path that will make the integral 

J=s j f { yh to J ,X *) dXj (17 ' 93) 

stationary. This is the general case in which Xj represents a set of independent 
variables and y i9 a set of dependent variables. Again, 

SJ = 0. (17.94) 

Now, however, we introduce one or more constraints. This means that the $y_i$'s are no longer independent of each other. Not all the $\eta_i$'s may be varied arbitrarily and Eqs. 17.62 or 17.73a would not apply. The constraint may have the form

\[ \varphi_k(y_i, x_j) = 0, \]  (17.95)

as in Section 17.6. In this case we may multiply by a function of $x_j$, say, $\lambda_k(x_j)$, and integrate over the same range as in Eq. 17.93 to obtain

\[ \int \lambda_k(x_j)\,\varphi_k(y_i, x_j)\,dx_j = 0. \]  (17.96)

Then clearly

\[ \delta \int \lambda_k(x_j)\,\varphi_k(y_i, x_j)\,dx_j = 0. \]  (17.97)

Alternatively, the constraint may appear in the form of an integral 

\[ \int \varphi_k\left(y_i, \frac{\partial y_i}{\partial x_j}, x_j\right) dx_j = \text{constant}. \]  (17.98)

We may introduce any constant Lagrangian multiplier and again Eq. 17.97 follows, now with $\lambda$ a constant.

In either case, by adding Eqs. 17.94 and 17.97, possibly with more than one constraint, we obtain

\[ \delta \int \left[ f\left(y_i, \frac{\partial y_i}{\partial x_j}, x_j\right) + \sum_k \lambda_k \varphi_k(y_i, x_j) \right] dx_j = 0. \]  (17.99)

The Lagrangian multiplier $\lambda_k$ may depend on $x_j$ when $\varphi(y_i, x_j)$ is given in the form of Eq. 17.95.

Treating the entire integrand as a new function $g(y_i, \partial y_i/\partial x_j, x_j)$, we obtain

\[ g\left(y_i, \frac{\partial y_i}{\partial x_j}, x_j\right) = f + \sum_k \lambda_k \varphi_k. \]  (17.100)

If we have $N$ $y_i$'s ($i = 1, 2, \ldots, N$) and $m$ constraints ($k = 1, 2, \ldots, m$), $N - m$ of the $\eta_i$'s may be taken as arbitrary. For the remaining $m$ $\eta_i$'s, the $\lambda$'s may, in principle, be chosen so that the remaining Euler-Lagrange equations are satisfied, completely analogous to Eq. 17.80. The result is that our composite function $g$ must satisfy the usual Euler-Lagrange equations

\[ \frac{\partial g}{\partial y_i} - \sum_j \frac{\partial}{\partial x_j} \frac{\partial g}{\partial(\partial y_i/\partial x_j)} = 0, \]  (17.101)

with one such equation for each dependent variable $y_i$ (compare Eqs. 17.64 and 17.73). These Euler equations and the equations of constraint are then solved simultaneously to find the function yielding a stationary value.


Lagrangian Equations

In the absence of constraints Lagrange's equations of motion (Eq. 17.52) were found to be¹

\[ \frac{d}{dt}\frac{\partial L}{\partial \dot q_i} - \frac{\partial L}{\partial q_i} = 0, \]

with $t$ (time) the one independent variable and $q_i(t)$ (particle position) a set of dependent variables. Usually the generalized coordinates $q_i$ are chosen to eliminate the forces of constraint, but this is not necessary and not always desirable. In the presence of constraints $\varphi_k$ Hamilton's principle is

\[ \delta \int \left[ L(q_i, \dot q_i, t) + \sum_k \lambda_k(t)\,\varphi_k(q_i, t) \right] dt = 0, \]  (17.102)

and the constrained Lagrangian equations of motion are

\[ \frac{d}{dt}\frac{\partial L}{\partial \dot q_i} - \frac{\partial L}{\partial q_i} = \sum_k a_{ik}\lambda_k. \]  (17.103)

Usually $\varphi_k = \varphi_k(q_i, t)$, independent of the generalized velocities $\dot q_i$. In this case the coefficient $a_{ik}$ is given by

\[ a_{ik} = \frac{\partial \varphi_k}{\partial q_i}. \]  (17.104)

If $q_i$ is a length, then $a_{ik}\lambda_k$ (no summation) represents the force of the $k$th constraint in the $\hat q_i$-direction, appearing in Eq. 17.103 in exactly the same way as $-\partial V/\partial q_i$.

¹ The symbol $q$ is customary in advanced mechanics. It serves to emphasize that the variable is not necessarily a cartesian variable (and not necessarily a length).


EXAMPLE 17.7.1 Simple Pendulum

To illustrate, consider the simple pendulum, a mass $m$, constrained by a wire of length $l$ to swing in an arc (Fig. 17.9). In the absence of the one constraint

\[ \varphi_1 = r - l = 0 \]  (17.105)

there are two generalized coordinates $r$ and $\theta$ (motion in vertical plane). The Lagrangian is

\[ L = T - V = \tfrac{1}{2} m(\dot r^2 + r^2\dot\theta^2) + mgr\cos\theta, \]  (17.106)

taking the potential $V$ to be zero when the pendulum is horizontal, $\theta = \pi/2$. By Eq. 17.103 the equations of motion are

\[ \frac{d}{dt}\frac{\partial L}{\partial \dot r} - \frac{\partial L}{\partial r} = \lambda_1, \qquad \frac{d}{dt}\frac{\partial L}{\partial \dot\theta} - \frac{\partial L}{\partial \theta} = 0 \]  (17.107)

(since $a_{r1} = 1$, $a_{\theta 1} = 0$), or

\[ \frac{d}{dt}(m\dot r) - mr\dot\theta^2 - mg\cos\theta = \lambda_1, \qquad \frac{d}{dt}(mr^2\dot\theta) + mgr\sin\theta = 0. \]  (17.108)

Substituting in the equation of constraint ($r = l$, $\dot r = 0$), we have

\[ ml\dot\theta^2 + mg\cos\theta = -\lambda_1, \qquad ml^2\ddot\theta + mgl\sin\theta = 0. \]  (17.109)

The second equation may be solved for $\theta(t)$ to yield simple harmonic motion if the amplitude is small ($\sin\theta \approx \theta$), whereas the first equation expresses $-\lambda_1$, the tension in the wire, in terms of $\theta$ and $\dot\theta$.

Note that since the equation of constraint, Eq. 17.105, is in the form of Eq. 17.95, the Lagrange multiplier $\lambda$ may be (and here is) a function of $t$ (or of $\theta$).
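As an illustrative check of Eq. 17.109, the following sketch integrates the angular equation numerically and evaluates the wire tension $-\lambda_1 = ml\dot\theta^2 + mg\cos\theta$ along the motion. The function name, the fourth-order Runge-Kutta integrator, and the parameter values ($m = 1$, $l = 1$, $g = 9.81$) are assumptions made for this example only.

```python
import math

def pendulum_tension_history(theta0, m=1.0, l=1.0, g=9.81, dt=1e-3, steps=3000):
    """Integrate m l^2 theta'' + m g l sin(theta) = 0 (Eq. 17.109) with RK4.

    The pendulum is released from rest at angle theta0, measured from the
    downward vertical.  Returns a list of (theta, theta_dot, tension) with
    tension = -lambda_1 = m l theta_dot^2 + m g cos(theta).
    """
    def accel(theta):
        return -(g / l) * math.sin(theta)

    theta, omega = theta0, 0.0
    history = []
    for _ in range(steps):
        tension = m * l * omega**2 + m * g * math.cos(theta)
        history.append((theta, omega, tension))
        # Classical RK4 step for theta' = omega, omega' = accel(theta).
        k1t, k1w = omega, accel(theta)
        k2t, k2w = omega + 0.5 * dt * k1w, accel(theta + 0.5 * dt * k1t)
        k3t, k3w = omega + 0.5 * dt * k2w, accel(theta + 0.5 * dt * k2t)
        k4t, k4w = omega + dt * k3w, accel(theta + dt * k3t)
        theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6.0
        omega += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6.0
    return history

# Release from the horizontal, theta = pi/2, where V was taken to be zero.
history = pendulum_tension_history(math.pi / 2)
max_tension = max(t for _, _, t in history)
```

For release from the horizontal, energy conservation gives $ml\dot\theta^2 = 2mg\cos\theta$, so the tension is $3mg\cos\theta$ and its largest value, at the bottom of the swing, should come out close to $3mg$.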


EXAMPLE 17.7.2 Sliding Off a Log

Closely related to this is the problem of a particle sliding on a cylindrical surface. The object is to find the critical angle $\theta_c$ at which the particle flies off from the surface. This critical angle is the angle at which the radial force of constraint goes to zero (Fig. 17.10).

FIG. 17.10 A particle sliding on a cylindrical surface

We have

\[ L = T - V = \tfrac{1}{2} m(\dot r^2 + r^2\dot\theta^2) - mgr\cos\theta \]  (17.110)

and the one equation of constraint

\[ \varphi_1 = r - l = 0. \]  (17.111)

Proceeding as in Example 17.7.1 with $a_{r1} = 1$,

\[ m\ddot r - mr\dot\theta^2 + mg\cos\theta = \lambda_1(\theta), \qquad mr^2\ddot\theta + 2mr\dot r\dot\theta - mgr\sin\theta = 0, \]  (17.112)

in which the constraining force $\lambda_1(\theta)$ is a function of the angle $\theta$.² Since $r = l$, $\ddot r = \dot r = 0$, Eq. 17.112 reduces to

\[ -ml\dot\theta^2 + mg\cos\theta = \lambda_1(\theta), \]  (17.113a)

\[ ml^2\ddot\theta - mgl\sin\theta = 0. \]  (17.113b)

Differentiating Eq. 17.113a with respect to time and remembering that

\[ \frac{df(\theta)}{dt} = \frac{df(\theta)}{d\theta}\,\dot\theta, \]  (17.114)

we obtain

\[ -2ml\ddot\theta - mg\sin\theta = \frac{d\lambda_1(\theta)}{d\theta}. \]  (17.115)

Using Eq. 17.113b to eliminate the $\ddot\theta$ term and then integrating, we have

\[ \lambda_1(\theta) = 3mg\cos\theta + C. \]  (17.116)

Since

\[ \lambda_1(0) = mg, \]  (17.117)

\[ C = -2mg. \]  (17.118)

The particle $m$ will stay on the surface as long as the force of constraint is nonnegative, that is, as long as the surface has to push outward on the particle:

\[ \lambda_1(\theta) = 3mg\cos\theta - 2mg \ge 0. \]  (17.119)

The critical angle lies where $\lambda_1(\theta_c) = 0$, the force of constraint going to zero. From Eq. 17.119

\[ \cos\theta_c = \tfrac{2}{3}, \quad \text{or} \quad \theta_c = 48^\circ 11' \]  (17.120)

from the vertical. At this angle (neglecting all friction) our particle takes off.

It must be admitted that this result can be obtained more easily by considering a varying centripetal force furnished by the radial component of the gravitational force. The example was chosen to illustrate the use of Lagrange's undetermined multiplier without confusing the reader with a complicated physical system.
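The critical angle can also be checked by direct integration. The sketch below, a rough numerical illustration rather than part of the derivation, integrates the angular equation of motion $\ddot\theta = (g/l)\sin\theta$ for a particle starting essentially at rest at the top and stops when the radial force of constraint $-ml\dot\theta^2 + mg\cos\theta$ first fails to be positive. The function name, the small starting angle, the step size, and the values $l = 1$, $g = 9.81$ are assumptions made for the example.

```python
import math

def departure_angle(l=1.0, g=9.81, m=1.0, theta_start=1e-3, dt=1e-4):
    """Angle (radians) at which the constraint force first reaches zero.

    Integrates theta'' = (g/l) sin(theta) with RK4.  The initial angular
    velocity at theta_start is fixed by energy conservation for a particle
    that was at rest at theta = 0.
    """
    def accel(th):
        return (g / l) * math.sin(th)

    def constraint_force(th, om):
        # Radial force of constraint: -m l theta_dot^2 + m g cos(theta).
        return -m * l * om**2 + m * g * math.cos(th)

    theta = theta_start
    omega = math.sqrt(2.0 * g * (1.0 - math.cos(theta)) / l)
    while constraint_force(theta, omega) > 0.0:
        k1t, k1w = omega, accel(theta)
        k2t, k2w = omega + 0.5 * dt * k1w, accel(theta + 0.5 * dt * k1t)
        k3t, k3w = omega + 0.5 * dt * k2w, accel(theta + 0.5 * dt * k2t)
        k4t, k4w = omega + dt * k3w, accel(theta + dt * k3t)
        theta += dt * (k1t + 2 * k2t + 2 * k3t + k4t) / 6.0
        omega += dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6.0
    return theta

theta_c_deg = math.degrees(departure_angle())
exact_deg = math.degrees(math.acos(2.0 / 3.0))   # from cos(theta_c) = 2/3
```

The two angles should agree to within the angular distance covered in one time step.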

² Note carefully that $\lambda_1$ is the radial force exerted by the cylinder on the particle. Consideration of the physical problem should show that $\lambda_1$ must depend on the angle $\theta$. We permitted $\lambda = \lambda(t)$. Now we are replacing the time-dependence by an (unknown) angular dependence.

EXAMPLE 17.7.3 The Schrodinger Wave Equation

As a final illustration of a constrained minimum, let us find the Euler equations for the quantum mechanical problem

\[ \delta \iiint \psi^*(x,y,z)\,H\,\psi(x,y,z)\,dx\,dy\,dz = 0, \]  (17.121)

with the constraint

\[ \iiint \psi^*\psi\,dx\,dy\,dz = 1. \]  (17.122)

Equation 17.121 is a statement that the energy of the system is stationary, $H$ being the quantum mechanical Hamiltonian for a particle of mass $m$, a differential operator,

\[ H = -\frac{\hbar^2}{2m}\nabla^2 + V(x,y,z). \]  (17.123)

Equation 17.122, the constraint, is the condition that there will be exactly one particle present; $\psi$ is the usual wave function, a dependent variable, and $\psi^*$, its complex conjugate, is treated as a second³ dependent variable.

The integrand in Eq. 17.121 involves second derivatives, which can be converted to first derivatives by integrating by parts:

\[ \int \psi^* \frac{\partial^2\psi}{\partial x^2}\,dx = \psi^*\frac{\partial\psi}{\partial x}\bigg| - \int \frac{\partial\psi^*}{\partial x}\frac{\partial\psi}{\partial x}\,dx. \]  (17.124)

We assume either periodic boundary conditions (as in the Sturm-Liouville theory, Chapter 9) or that the volume of integration is so large that $\psi$ and $\psi^*$ vanish strongly⁴ at the boundary. Then the integrated part vanishes and Eq. 17.121 may be rewritten as

\[ \delta \iiint \left[ \frac{\hbar^2}{2m}\nabla\psi^*\cdot\nabla\psi + V\psi^*\psi \right] dx\,dy\,dz = 0. \]  (17.125)

The function $g$ of Eq. 17.100 is

\[ g = \frac{\hbar^2}{2m}\nabla\psi^*\cdot\nabla\psi + V\psi^*\psi - \lambda\psi^*\psi = \frac{\hbar^2}{2m}(\psi^*_x\psi_x + \psi^*_y\psi_y + \psi^*_z\psi_z) + V\psi^*\psi - \lambda\psi^*\psi, \]  (17.126)

again using the subscript $x$ to denote $\partial/\partial x$. For $y_i = \psi^*$ Eq. 17.101 becomes

\[ \frac{\partial g}{\partial\psi^*} - \frac{\partial}{\partial x}\frac{\partial g}{\partial\psi^*_x} - \frac{\partial}{\partial y}\frac{\partial g}{\partial\psi^*_y} - \frac{\partial}{\partial z}\frac{\partial g}{\partial\psi^*_z} = 0. \]

This yields

\[ V\psi - \lambda\psi - \frac{\hbar^2}{2m}(\psi_{xx} + \psi_{yy} + \psi_{zz}) = 0 \]

or

\[ -\frac{\hbar^2}{2m}\nabla^2\psi + V\psi = \lambda\psi. \]  (17.127)

Reference to Eq. 17.123 enables us to identify $\lambda$ physically as the energy of the quantum mechanical system. With this interpretation, Eq. 17.127 is the celebrated Schrodinger wave equation. This variational approach is more than just a matter of academic curiosity. It provides a very powerful method of obtaining approximate solutions of the wave equation (Rayleigh-Ritz variational method, Section 17.8).

³ Compare Section 6.1.

⁴ $\lim_{r\to\infty} r^{1/2}\psi(r) = 0$.


EXERCISES 

17.7.1 A particle, mass $m$, is on a frictionless horizontal surface. It is constrained to move so that $\theta = \omega t$ (rotating radial arm, no friction). With the initial conditions

$t = 0$, $r = r_0$, $\dot r = 0$,

(a) find the radial positions as a function of time. ANS. $r(t) = r_0\cosh\omega t$.

(b) find the force exerted on the particle by the constraint.

ANS. $F^{(c)} = 2m\dot r\omega = 2mr_0\omega^2\sinh\omega t$.

17.7.2 A point mass $m$ is moving over a flat, horizontal, frictionless plane. The mass is constrained by a string to move radially inward at a constant rate. Using plane polar coordinates $(\rho, \varphi)$, $\rho = \rho_0 - kt$.

(a) Set up the Lagrangian.

(b) Obtain the constrained Lagrange equations.

(c) Solve the $\varphi$-dependent Lagrange equation to obtain $\omega(t)$, the angular velocity. What is the physical significance of the constant of integration that you get from your "free" integration?

(d) Using the $\omega(t)$ from part (c), solve the $\rho$-dependent (constrained) Lagrange equation to obtain $\lambda(t)$. In other words, explain what is happening to the force of constraint as $\rho \to 0$.

17.7.3 A flexible cable is suspended from two fixed points. The length of the cable is 
fixed. Find the curve that will minimize the total gravitational potential energy 
of the cable. 

ANS. Hyperbolic cosine. 

17.7.4 A fixed volume of water is rotating in a cylinder with constant angular velocity $\omega$. Find the curve of the water surface that will minimize the total potential energy of the water in the combined gravitational-centrifugal force field.

ANS. Parabola. 

17.7.5 (a) Show that for a fixed-length perimeter the figure with maximum area is a circle.

(b) Show that for a fixed area the curve with minimum perimeter is a circle.
Hint. The radius of curvature $R$ is given by

\[ R = \frac{(r^2 + r_\theta^2)^{3/2}}{r r_{\theta\theta} - 2r_\theta^2 - r^2}. \]

Note. The problems of this section, variation subject to constraints, are often called isoperimetric. The term arose from problems of maximizing area subject to a fixed perimeter, as in Exercise 17.7.5(a).

17.7.6 Show that requiring $J$, given by

\[ J = \int_a^b \left[ p(x)\,y_x^2 - q(x)\,y^2 \right] dx, \]

to have a stationary value subject to the normalizing condition

\[ \int_a^b y^2 w(x)\,dx = 1 \]

leads to the Sturm-Liouville equation of Chapter 9:

\[ \frac{d}{dx}\left(p\frac{dy}{dx}\right) + qy + \lambda wy = 0. \]

Note. The boundary condition

\[ p\,y_x\,y\,\Big|_a^b = 0 \]

is used in Section 9.1 in establishing the Hermitian property of the operator.


17.7.7 Show that requiring $J$, given by

\[ J = \int_a^b\!\int_a^b K(x,t)\,\varphi(x)\,\varphi(t)\,dx\,dt, \]

to have a stationary value subject to the normalizing condition

\[ \int_a^b \varphi^2(x)\,dx = 1 \]

leads to the Hilbert-Schmidt integral equation, Eq. 16.89.
Note. The kernel $K(x,t)$ is symmetric.


17.8 RAYLEIGH-RITZ VARIATIONAL TECHNIQUE 


Exercise 17.7.6 opens up a relation between the calculus of variations and eigenfunction-eigenvalue problems. We may rewrite the expression of Exercise 17.7.6 as

\[ F[y(x)] = \frac{\int_a^b (p\,y_x^2 - q\,y^2)\,dx}{\int_a^b y^2 w\,dx}, \]  (17.128)

in which the constraint appears in the denominator as a usual normalizing condition. The quantity $F$, a function of the function $y(x)$, is sometimes called a functional. Since the denominator is constant (for normalized functions), the stationary values of $J$ correspond to the stationary values of $F$. Then from Exercise 17.7.6 when $y(x)$ is such that $J$ and $F$ take on a stationary value, the optimum function $y(x)$ satisfies the Sturm-Liouville equation

\[ \frac{d}{dx}\left(p\frac{dy}{dx}\right) + qy + \lambda wy = 0, \]  (17.129)

with $\lambda$ the eigenvalue (not a Lagrangian multiplier). Integrating the numerator of Eq. 17.128 by parts and using the boundary condition,

\[ p\,y_x\,y\,\Big|_a^b = 0, \]  (17.130)

we obtain

\[ F[y(x)] = -\frac{\int_a^b y\left\{\frac{d}{dx}\left(p\frac{dy}{dx}\right) + qy\right\} dx}{\int_a^b y^2 w\,dx}. \]  (17.131)

Then substituting in Eq. 17.129, the stationary values of $F[y(x)]$ are given by

\[ F[y_n(x)] = \lambda_n, \]  (17.132)

with $\lambda_n$ the eigenvalue corresponding to the eigenfunction $y_n$. Equation 17.132 with $F$ given by either Eq. 17.128 or 17.131 forms the basis of the Rayleigh-Ritz method for the computation of eigenfunctions and eigenvalues.


Ground State Eigenfunction 

Suppose that we seek to compute the ground state eigenfunction $y_0$ and eigenvalue¹ $\lambda_0$ of some complicated atomic or nuclear system. The classical example for which no exact solution exists is the helium atom problem. The eigenfunction $y_0$ is unknown, but we shall assume we can make a pretty good guess at an approximate function $y$, so that mathematically we may write²

\[ y = y_0 + \sum_{i=1}^{\infty} c_i y_i. \]  (17.133)

The $c_i$'s are small quantities. (How small depends on how good our guess was.) The $y_i$'s are normalized eigenfunctions (also unknown), and therefore our trial function $y$ is not normalized.

Substituting the approximate function $y$ into Eq. 17.131 and noting that

\[ \int_a^b y_i \left\{ \frac{d}{dx}\left(p\frac{dy_j}{dx}\right) + q y_j \right\} dx = -\lambda_i\,\delta_{ij}, \]  (17.134)

we obtain

\[ F[y(x)] = \frac{\lambda_0 + \sum_{i=1}^{\infty} c_i^2\lambda_i}{1 + \sum_{i=1}^{\infty} c_i^2}. \]  (17.135)

Here we have taken the eigenfunctions to be orthogonal, since they are solutions of the Sturm-Liouville equation, Eq. 17.129. We also assume that $y_0$ is nondegenerate. Now, if we expand the denominator of Eq. 17.135 by the binomial theorem and discard terms of order $c_i^4$,

\[ F[y(x)] = \lambda_0 + \sum_{i=1}^{\infty} c_i^2(\lambda_i - \lambda_0). \]  (17.136)

Equation 17.136 contains two important results.

1. Whereas the error in the eigenfunction $y$ was $O(c_i)$, the error in $\lambda$ is $O(c_i^2)$. Even a poor approximation of the eigenfunctions may yield an accurate calculation of the eigenvalue.

2. If $\lambda_0$ is the lowest eigenvalue (ground state), then since $\lambda_i - \lambda_0 > 0$,

\[ F[y(x)] = \lambda \ge \lambda_0, \]  (17.137)

or our approximation is always on the high side, becoming lower, converging on $\lambda_0$ as our approximate eigenfunction $y$ improves ($c_i \to 0$). Note that Eq. 17.137 is a direct consequence of Eq. 17.135 independent of our binomial approximation.

¹ This means that $\lambda_0$ is the lowest eigenvalue. It is clear from Eq. 17.128 that if $p(x) > 0$ and $q(x) \le 0$ (compare Table 9.1), then $F[y(x)]$ has a lower bound and this lower bound is nonnegative. Recall from Section 9.1 that $w(x) > 0$.

² We are guessing at the form of the function. The normalization is irrelevant.
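The quadratic dependence of the eigenvalue error on the admixture coefficient can be seen in a small numerical experiment. The sketch below evaluates Eq. 17.135 for a trial function with a single contaminating term, $y = y_0 + c\,y_1$. The choice of eigenvalues $\lambda_0 = \pi^2$, $\lambda_1 = (2\pi)^2$ (those of a string clamped at $x = 0$ and 1) and the function name are assumptions made for illustration.

```python
import math

def rayleigh_quotient_two_term(c, lam0, lam1):
    """Eq. 17.135 for y = y_0 + c*y_1 with normalized, orthogonal y_0 and y_1."""
    return (lam0 + c**2 * lam1) / (1.0 + c**2)

# Assumed eigenvalues: lambda_n = (n*pi)^2 for sin(n*pi*x) on [0, 1].
lam0, lam1 = math.pi**2, (2.0 * math.pi)**2

errors = {}
for c in (0.1, 0.01):
    F = rayleigh_quotient_two_term(c, lam0, lam1)
    errors[c] = F - lam0   # eigenvalue error for an eigenfunction error of order c

# Shrinking c by a factor of 10 should shrink the error by roughly 100.
ratio = errors[0.1] / errors[0.01]
```

Both errors are positive, in line with Eq. 17.137: the estimate approaches $\lambda_0$ from above.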


EXAMPLE 17.8.1 Vibrating String 


A vibrating string, clamped at $x = 0$ and 1, satisfies the eigenvalue equation

\[ \frac{d^2y}{dx^2} + \lambda y = 0 \]  (17.138)

and the boundary condition $y(0) = y(1) = 0$. For this simple example the student will recognize immediately that $y_0(x) = \sin\pi x$ (unnormalized) and $\lambda_0 = \pi^2$. But let us try out the Rayleigh-Ritz technique.

With one eye on the boundary conditions, we try

\[ y(x) = x(1 - x). \]  (17.139)

Then with $p = 1$ and $w = 1$, Eq. 17.128 yields

\[ F[y(x)] = \frac{\int_0^1 (1 - 2x)^2\,dx}{\int_0^1 x^2(1 - x)^2\,dx} = \frac{1/3}{1/30} = 10. \]  (17.140)

This result, $\lambda = 10$, is a fairly good approximation (1.3% error)³ of $\lambda_0 = \pi^2 \approx 9.8696$. The reader may have noted that $y(x)$, Eq. 17.139, is not normalized to unity. The denominator in $F[y(x)]$ compensates for the lack of unit normalization.
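The arithmetic of Eq. 17.140 is easy to reproduce exactly. The following sketch represents the trial function as a list of polynomial coefficients and evaluates the two integrals in rational arithmetic; it takes $p = 1$, $w = 1$ as in the text and $q = 0$ as implied by Eq. 17.138. The helper names are hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists [c0, c1, c2, ...]."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Coefficient list of the derivative."""
    return [k * c for k, c in enumerate(p)][1:]

def poly_integral_01(p):
    """Exact integral of the polynomial over [0, 1]."""
    return sum(Fraction(c) / (k + 1) for k, c in enumerate(p))

def rayleigh_quotient(y):
    """F[y] of Eq. 17.128 with p = 1, q = 0, w = 1 on [0, 1]."""
    dy = poly_diff(y)
    return poly_integral_01(poly_mul(dy, dy)) / poly_integral_01(poly_mul(y, y))

y_trial = [Fraction(0), Fraction(1), Fraction(-1)]   # y = x - x^2 = x(1 - x)
numerator = poly_integral_01(poly_mul(poly_diff(y_trial), poly_diff(y_trial)))
F = rayleigh_quotient(y_trial)
```

Because the quotient is a ratio, multiplying `y_trial` by any constant leaves `F` unchanged, which is the compensation for normalization noted above.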

³ The closeness of the fit may be checked by a Fourier sine expansion (compare Exercise 14.2.3) over the half interval [0, 1] or, equivalently, over the interval [-1, 1], with $y(x)$ taken to be odd. Because of the even symmetry relative to $x = 1/2$, only odd $n$ terms appear:

\[ y(x) = x(1 - x) = \frac{8}{\pi^3}\left[ \sin\pi x + \frac{\sin 3\pi x}{3^3} + \frac{\sin 5\pi x}{5^3} + \cdots \right]. \]

In the usual scientific calculation the eigenfunction would be improved by introducing more terms and adjustable parameters such as

\[ y = x(1 - x) + a_2 x^2(1 - x)^2. \]  (17.141)

It is convenient to have the additional terms orthogonal, but it is not necessary. The parameter $a_2$ is adjusted to minimize $F[y(x)]$. In this case, choosing $a_2 = 1.1353$ drives $F[y(x)]$ down to 9.8697, very close to the exact eigenvalue.
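One possible way to carry out this minimization numerically is sketched below: the integrals in Eq. 17.128 are evaluated by composite Simpson's rule and the parameter is adjusted by a golden-section search. The function names, the number of quadrature panels, and the search interval $[0, 3]$ are assumptions of this example, not something prescribed by the text.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0

def rayleigh_quotient(a2):
    """F[y] of Eq. 17.128 (p = 1, q = 0, w = 1) for y = x(1-x) + a2*x^2*(1-x)^2."""
    y = lambda x: x * (1 - x) + a2 * x**2 * (1 - x)**2
    dy = lambda x: (1 - 2 * x) * (1 + 2 * a2 * x * (1 - x))   # derivative of y
    num = simpson(lambda x: dy(x)**2, 0.0, 1.0)
    den = simpson(lambda x: y(x)**2, 0.0, 1.0)
    return num / den

def golden_section_min(f, lo, hi, tol=1e-6):
    """Locate the minimum of a unimodal function on [lo, hi]."""
    r = (math.sqrt(5.0) - 1.0) / 2.0
    x1, x2 = hi - r * (hi - lo), lo + r * (hi - lo)
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 < f2:                      # minimum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - r * (hi - lo)
            f1 = f(x1)
        else:                            # minimum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + r * (hi - lo)
            f2 = f(x2)
    return 0.5 * (lo + hi)

a2_best = golden_section_min(rayleigh_quotient, 0.0, 3.0)
F_best = rayleigh_quotient(a2_best)
```

The minimum of $F$ is quite flat in $a_2$, so neighbouring values of the parameter give nearly the same eigenvalue estimate; the estimate itself stays above $\pi^2$, as Eq. 17.137 requires.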


EXERCISES 

17.8.1 From Eq. 17.128 develop in detail the argument that $\lambda \ge 0$. Explain the circumstances under which $\lambda = 0$ and illustrate with several examples.

1 7 . 8.2 An unknown function satisfies the differential equation 


r+tyy-o 

and the boundary conditions 

$y(0) = 1$, $y(1) = 0$.

(a) Calculate the approximation

\[ \lambda = F[y_{\text{trial}}] \]

for

\[ y_{\text{trial}} = 1 - x^2. \]

(b) Compare with the exact eigenvalue.

ANS. (a) $\lambda = 2.5$
(b) $\lambda/\lambda_{\text{exact}} = 1.013$.


17.8.3 In Exercise 17.8.2 use a trial function

\[ y = 1 - x^n. \]

(a) Find the value of $n$ that will minimize $F[y_{\text{trial}}]$.

(b) Show that the optimum value of $n$ drives the ratio $\lambda/\lambda_{\text{exact}}$ down to 1.003.

ANS. (a) $n = 1.7247$.

17.8.4 A quantum mechanical particle in a sphere (Example 11.7.1) satisfies

\[ \nabla^2\psi + k^2\psi = 0, \]

with $k^2 = 2mE/\hbar^2$. The boundary condition is that $\psi(r = a) = 0$, where $a$ is the radius of the sphere. For the ground state [where $\psi = \psi(r)$] try an approximate wave function

and calculate an approximate eigenvalue k 2 . 

Hint. To determine p(r) and w(r), put your equation in self-adjoint form (in 
spherical polar coordinates). 

ANS. k 2 a = ^ 
a z 

k 2 = — 

^exact ^2 * 
17.8.5 The wave equation for the quantum mechanical oscillator may be written as

\[ \frac{d^2\psi(x)}{dx^2} + (\lambda - x^2)\,\psi(x) = 0, \]

with $\lambda = 1$ for the ground state (Eq. 13.18). Take

\[ \psi_{\text{trial}} = \begin{cases} 1 - (x^2/a^2), & x^2 \le a^2 \\ 0, & x^2 > a^2 \end{cases} \]

for the ground-state wave function (with $a^2$ an adjustable parameter) and calculate the corresponding ground-state energy. How much error do you have?

Note. Your parabola is really not a very good approximation to a Gaussian exponential. What improvements can you suggest?


17.8.6 The Schrodinger equation for a central potential may be written as

\[ \mathcal{L}u(r) + \frac{\hbar^2 l(l+1)}{2Mr^2}\,u(r) = E\,u(r). \]

The $l(l+1)$ term comes from splitting off the angular dependence (Section 2.5). Treating this term as a perturbation, use your variational technique to show that $E > E_0$, where $E_0$ is the energy eigenvalue of $\mathcal{L}u_0 = E_0 u_0$ corresponding to $l = 0$. This means that the minimum energy state will have $l = 0$, zero angular momentum.

Hint. You can expand $u(r)$ as $u_0(r) + \sum_{i=1}^{\infty} c_i u_i$, where $\mathcal{L}u_i = E_i u_i$, $E_i > E_0$.


17.8.7 In the matrix eigenvector, eigenvalue equation

\[ A\mathbf{r}_i = \lambda_i \mathbf{r}_i, \]

$A$ is an $n \times n$ Hermitian matrix. For simplicity, assume that its $n$ real eigenvalues (Section 4.6) are distinct, $\lambda_1$ being the largest. If $\mathbf{r}$ is an approximation to $\mathbf{r}_1$,

\[ \mathbf{r} = \mathbf{r}_1 + \sum_{i=2}^{n} \delta_i \mathbf{r}_i, \]

show that

\[ \frac{\mathbf{r}^\dagger A\mathbf{r}}{\mathbf{r}^\dagger\mathbf{r}} \le \lambda_1 \]

and that the error in $\lambda_1$ is of the order $|\delta_i|^2$. Take $|\delta_i| \ll 1$.

Hint. The $n$ $\mathbf{r}_i$ form a complete orthogonal set spanning the $n$-dimensional (complex) space.


17.8.8 The variational solution of Example 17.8.1 may be refined by taking $y = x(1 - x) + a_2 x^2(1 - x)^2$. Using numerical quadrature, calculate $\lambda_{\text{approx}} = F[y(x)]$, Eq. 17.128, for a fixed value of $a_2$. Vary $a_2$ to minimize $\lambda$. Calculate the value of $a_2$ that minimizes $\lambda$ and $\lambda$ itself to five significant figures. Compare your eigenvalue $\lambda$ with $\pi^2$.
APPENDIX 1
REAL ZEROS OF A 
FUNCTION 


The demand for the values of the real zeros of a function occurs frequently in 
mathematical physics. Examples include the boundary conditions on the solu- 
tion of a coaxial wave guide problem, Example 11.3.1, eigenvalue problems in 
quantum mechanics such as the deuteron with a square well potential, Example 
9.1.2, and the location of the evaluation points in Gaussian quadrature (Appen- 
dix 2). 

The IBM Scientific Subroutine Package (SSP) offers three subroutines for 
determining the real zeros of functions. These are (1) RTWI, an iteration tech- 
nique due to Wegstein, (2) RTMI, Mueller’s bisection iteration technique and 
(3) RTNI, Newton’s method, hallowed in introductory calculus. All three 
methods require close initial guesses of the zero or root. How close depends on 
how wildly your function is varying and what accuracy you demand. All are 
methods for refining a good initial value. To obtain the good initial value and 
to locate pathological features that must be avoided (such as discontinuities or 
singularities), you should make a reasonably detailed graph of the function. 
There is no real substitute for a graph. Exercise 11.3.12 emphasizes this point. 

Newton's Method 

This is commonly presented in differential calculus because it illustrates 
differential calculus. It may sometimes be a good method — if you know exactly 
what your function is doing. 

Newton's method assumes the function $f(x)$ to have a continuous first derivative. From the geometrical interpretation of a derivative as the tangent to the curve, Fig. 1,

\[ \frac{f(x_0)}{x_1 - x_0} = -f'(x_0) \]  (A1.1)

or

\[ x_1 = x_0 - \frac{f(x_0)}{f'(x_0)}. \]  (A1.2)

With $x_0$ as the initial guess, calculate $x_1$ from Eq. A1.2. Iterating, from $x_1$ you calculate $x_2$ and hopefully converge rapidly on the root.
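A minimal sketch of the iteration of Eq. A1.2 follows. It is not the SSP routine RTNI; the function name, the stopping test on successive iterates, and the default tolerance and iteration cap are assumptions made for this illustration.

```python
def newton(f, fprime, x0, eps=1e-10, max_iter=50):
    """Iterate x_{k+1} = x_k - f(x_k)/f'(x_k), Eq. A1.2.

    Returns the iterate once two successive values differ by less than eps.
    Raises if the derivative vanishes or max_iter iterations are used up.
    """
    x = x0
    for _ in range(max_iter):
        slope = fprime(x)
        if slope == 0.0:
            raise ZeroDivisionError("f'(x) = 0; Newton step undefined")
        x_new = x - f(x) / slope
        if abs(x_new - x) < eps:
            return x_new
        x = x_new
    raise RuntimeError("no convergence within max_iter iterations")

# Example: the positive root of f(x) = x^2 - 2, starting from x0 = 1.
root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, x0=1.0)
```

Capping the number of iterations matters: for a function with no real zero, such as $f(x) = x^2 + 1$, the iterates wander indefinitely and the loop would otherwise never end.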

Newton's method does require computation of the derivative. This may or may not be a handicap. Calculation of the derivative in Exercise 11.3.12 would be messy. But the real objection to Newton's method is that it is extremely treacherous. It may fail to converge, oscillating in the vicinity of a local maximum or minimum (Fig. 2), or it may diverge in the vicinity of an inflection point. Or, if your initial guess is not close enough, Newton's method may converge to the wrong root. Unless you know exactly what your function is doing, this is a method to avoid.

FIG. 2 Newton's method: local minimum, no convergence

Bisection Method 

This method assumes only that $f(x)$ is continuous. It requires that initial values $x_l$ and $x_r$ straddle the zero being sought. Thus $f(x_l)$ and $f(x_r)$ will have opposite signs, making the product $f(x_l) \cdot f(x_r)$ negative. In the simplest form of the bisection method, take the midpoint $x_m = \frac{1}{2}(x_l + x_r)$ and test to see which interval, $[x_l, x_m]$ or $[x_m, x_r]$, contains the zero. The easiest test is to see if one product, say, $f(x_m) \cdot f(x_r) < 0$. If this product is negative, then the root is in the upper half interval $[x_m, x_r]$; if positive, then the root must be in the lower half interval $[x_l, x_m]$. Remember, we are assuming $f(x)$ to be continuous. The interval containing the zero is relabeled $[x_l, x_r]$ and the bisecting continues (as in Fig. 3) until the root is located to the desired degree of accuracy. Of course, the better the initial choice of $x_l$ and $x_r$ is, the fewer will be the bisections required. However, as explained subsequently, it is important to specify the maximum number of bisections that will be permitted.

FIG. 3 Bisection root-finding method
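The bisection procedure just described might be coded as follows. This is a sketch, not the SSP routine RTMI; the function name, argument names, and default tolerance and iteration limit are assumptions, and the caller is expected to supply $x_l < x_r$.

```python
import math

def bisect(f, x_l, x_r, eps=1e-10, max_iter=100):
    """Bisection on [x_l, x_r], which must straddle a zero of a continuous f."""
    f_l, f_r = f(x_l), f(x_r)
    if f_l == 0.0:
        return x_l
    if f_r == 0.0:
        return x_r
    if f_l * f_r > 0.0:
        raise ValueError("initial values do not straddle a sign change")
    for _ in range(max_iter):
        x_m = 0.5 * (x_l + x_r)
        f_m = f(x_m)
        if f_m == 0.0 or (x_r - x_l) < eps:
            return x_m
        if f_m * f_r < 0.0:
            x_l, f_l = x_m, f_m      # root is in the upper half [x_m, x_r]
        else:
            x_r, f_r = x_m, f_m      # root is in the lower half [x_l, x_m]
    raise RuntimeError("tolerance not reached within max_iter bisections")

# Example: the zero of cos(x) - x, straddled by x_l = 0 and x_r = 1.
root = bisect(lambda x: math.cos(x) - x, 0.0, 1.0)
```

The sign test cannot distinguish a zero from a discontinuity with a sign change, so the routine is only as trustworthy as the caller's knowledge that $f$ is continuous on the starting interval.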

This bisection technique may not have the elegance of Newton's method, but it is reasonably fast and much more reliable, almost foolproof if you avoid discontinuous functions, such as $f(x) = 1/(x - a)$, shown in Fig. 4. Again, there is no substitute for knowing the detailed local behavior of your function in the vicinity of your supposed root.

In general, the bisection method (RTMI) is recommended. 

Two Warnings 

1. Since the computer carries only a finite number of significant figures we cannot expect to calculate a zero with infinite precision. It is necessary to specify some tolerance. All three SSP subroutines RTWI, RTMI, and RTNI require that some tolerance be specified (input parameter EPS). When the root is located to within this tolerance the subroutine returns control to the main calling program.

2. All the approaches mentioned here are iteration techniques. How many times do you iterate? How do you decide to stop? It is possible to program the iteration so that it continues until the desired accuracy is obtained. The danger is that some factor may prevent reasonable convergence. Then your tolerance is never achieved and you have an infinite loop. It is far safer to specify in advance a maximum number of iterations. Again, this is the approach of all three SSP subroutines (input parameter IEND). Thus these subroutines will stop when either a zero is determined to within your specified tolerance or the number of iterations reaches your specified maximum, whichever occurs first. With a simple bisection technique the selection of a number of iterations depends on the initial spread $x_r - x_l$ and on the precision you demand. Each iteration will cut the range by a factor of 2. Since $2^{10} = 1024 \approx 10^3$, 10 iterations should add 3 significant figures, 20 should add 6 significant figures to the location of the root.

FIG. 4 A simple pole, $f(x_l) \cdot f(x_r) < 0$ but no root
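The iteration estimate for bisection can be turned into a one-line calculation: the number of halvings needed is the smallest $k$ with spread$/2^k \le$ tolerance. The function name below is hypothetical.

```python
import math

def bisections_needed(spread, tolerance):
    """Smallest k such that spread / 2**k <= tolerance."""
    return max(0, math.ceil(math.log2(spread / tolerance)))

k3 = bisections_needed(1.0, 1e-3)   # gain three significant figures
k6 = bisections_needed(1.0, 1e-6)   # gain six significant figures
```

For a unit initial spread these reproduce the 10 and 20 iterations quoted above, and give a sensible value to pass as the maximum iteration count.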
EXERCISES

1.1 Given $f(x) = x - ax^3$. How small must $|x_0|$ be for Newton's method to converge to $x = 0$?

1.2 Try Newton's method (RTNI or your own program) to locate a root of the following functions

(a) $f(x) = x^2 + 1$, $x_0 = 0.9, 1.0$
(b) $f(x) = (x^2 + 1)^{1/2}$, $x_0 = 0.9, 1.0$
(c) $f(x) = \sin x$, $x_0 = 1.0, 1.1, 1.2$
(d) $f(x) = \tanh x$, $x_0 = 0.9, 1.0, 1.1$.

RTNI demands that you write a subroutine to supply RTNI with $f(x)$ and its derivative. Write out $x$ and $f(x)$ every time the subroutine is called, so that you can trace the sequence of extrapolations.

1.3 As an example of what Newton's method can do, call RTNI to find the largest root of the Chebyshev polynomial $T_{10}(x)$. Try a succession of initial values $x = 0.95, 0.96, 0.97$, and 0.98. Explain in detail what has happened.

Note. RTNI demands a subprogram that will supply the function ($T_{10}(x)$) and its derivative. SSP subroutine CNP will provide $T_{10}(x)$ and lower index $T$'s. $T'_{10}(x)$ may be calculated from Eq. 13.77 ($x \ne \pm 1$).

ANS. maximum root = 0.98769.

1.4 Write a simple bisection root determination subroutine that will determine a simple real root once you have straddled it. Test your subroutine by determining the roots of one or more polynomials or elementary transcendental functions.

1.5 The theory of free radial oscillations of a homogeneous earth leads to an equation

\[ \tan x = \frac{x}{1 - a^2x^2}. \]

The parameter $a$ depends on the velocities of the primary and secondary waves. For $a = 1.0$, find the first three positive roots of this equation.

ANS. $x_1 = 2.7437$, $x_2 = 6.1168$, $x_3 = 9.3166$.

1.6 (a) Using the Bessel function $J_0(x)$ generated by SSP subroutine BESJ, locate consecutive roots of $J_0(x)$: $\alpha_n$ and $\alpha_{n+1}$ for $n = 5, 10, 15, \ldots, 30$. Tabulate $\alpha_n$, $\alpha_{n+1}$, $(\alpha_{n+1} - \alpha_n)$ and $(\alpha_{n+1} - \alpha_n)/\pi$. Note how this last ratio is approaching unity.
Hint. RTMI will pinpoint the root once you have straddled it.

(b) Compare your values of $\alpha_n$ with values calculated from McMahon's expansion, AMS-55, Eq. 9.5.12.
APPENDIX 2
GAUSSIAN
QUADRATURE


Interpolatory Formulas

The problem is to find the numerical value of a definite integral

\[ \int_a^b f(x)\,w(x)\,dx. \]

We approximate our integral by a finite sum

\[ \int_a^b f(x)\,w(x)\,dx \approx \sum_{k=1}^{n} A_k f(x_k). \]  (A2.1)

The sum in Eq. A2.1 contains $2n + 1$ parameters:

$n$ $x_k$'s, points for evaluating $f(x)$
$n$ $A_k$'s, coefficients
1 the choice of $n$ itself.

We proceed by replacing $f(x)$ by an interpolating polynomial $P(x)$ of degree $n - 1$ and a remainder term:

\[ f(x) = P(x) + r(x). \]  (A2.2)

$P(x)$ is fitted to $f(x)$ at the $n$ $x_k$ [$P(x_k) = f(x_k)$] by the choice

\[ P(x) = \sum_{k=1}^{n} \frac{\alpha(x)}{(x - x_k)\,\alpha'(x_k)}\,f(x_k), \]  (A2.3)

where $\alpha(x)$ is a completely factored $n$th-degree polynomial,

\[ \alpha(x) = (x - x_1)(x - x_2)\cdots(x - x_n). \]  (A2.4)

Note that

\[ \lim_{x\to x_k} \frac{\alpha(x)}{(x - x_k)\,\alpha'(x_k)} = 1. \]  (A2.5)

For $f(x)$ a polynomial of degree $n - 1$ the remainder term $r(x)$ is zero and Eq. A2.3 becomes an identity. Specifically (using Eq. A2.5), $P(x_k) = f(x_k)$, the $(n - 1)$-degree polynomial is fitted to $f(x)$ at $n$ $x_k$.
When the integral of the remainder term is small

\[ \int_a^b f(x)\,w(x)\,dx \approx \int_a^b P(x)\,w(x)\,dx = \int_a^b \sum_{k=1}^{n} \frac{\alpha(x)}{(x - x_k)\,\alpha'(x_k)}\,f(x_k)\,w(x)\,dx, \]  (A2.6)

using Eq. A2.3. Interchanging summation and integration, we obtain

\[ \int_a^b f(x)\,w(x)\,dx \approx \sum_{k=1}^{n} f(x_k) \int_a^b \frac{\alpha(x)}{(x - x_k)\,\alpha'(x_k)}\,w(x)\,dx = \sum_{k=1}^{n} A_k f(x_k). \]  (A2.7)

Quadrature formulas of this type are labeled interpolatory. Since every polynomial $f(x)$ of degree $n - 1$ may be represented exactly [$r(x) = 0$] by our $n$-point-fit interpolating polynomial $P(x)$, Eq. A2.7 is exact for such polynomial functions, $f(x)$.
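To make the definition of the coefficients $A_k$ in Eq. A2.7 concrete, the sketch below computes them exactly for $w(x) = 1$ and a given set of nodes, using polynomial coefficient lists and rational arithmetic. The helper names and the choice of three equally spaced nodes 0, 1, 2 on $[0, 2]$ are assumptions made for the example.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists [c0, c1, ...]."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_eval(p, x):
    return sum(c * x**k for k, c in enumerate(p))

def poly_integral(p, a, b):
    """Exact integral of the polynomial over [a, b]."""
    a, b = Fraction(a), Fraction(b)
    return sum(c * (b**(k + 1) - a**(k + 1)) / (k + 1) for k, c in enumerate(p))

def interpolatory_weights(nodes, a, b):
    """A_k = int_a^b alpha(x) / ((x - x_k) alpha'(x_k)) dx, with w(x) = 1."""
    weights = []
    for k, xk in enumerate(nodes):
        # alpha(x)/(x - x_k) is the product of (x - x_j) over j != k.
        numer = [Fraction(1)]
        for j, xj in enumerate(nodes):
            if j != k:
                numer = poly_mul(numer, [Fraction(-xj), Fraction(1)])
        # alpha'(x_k) equals that same product evaluated at x = x_k.
        denom = poly_eval(numer, Fraction(xk))
        weights.append(poly_integral(numer, a, b) / denom)
    return weights

weights = interpolatory_weights([0, 1, 2], 0, 2)
```

For these three nodes the weights come out in the ratio 1 : 4 : 1, and they sum to the length of the interval, as they must for the rule to integrate a constant exactly.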

The locations of the $x_k$, the zeros of $\alpha(x)$ in Eq. A2.7, have not been specified. Taking them to be equally spaced leads to the various Newton-Cotes formulas. Of these Simpson's rule (Eq. A2.8) is probably the best known and, among the simpler formulas, it is the most accurate.

\[ \int_a^b f(x)\,dx \approx \frac{h}{3}\{ f(a) + 4f(a + h) + 2f(a + 2h) + 4f(a + 3h) + 2f(a + 4h) + \cdots + 4f(b - h) + f(b) \}. \]  (A2.8)

Here $h$ is the distance between the equally spaced points, $h = x_2 - x_1 = x_3 - x_2$, and so on. Equation A2.8 may be considered a sum of three-point fits

\[ \int_c^{c+2h} f(x)\,dx \approx \frac{h}{3}\{ f(c) + 4f(c + h) + f(c + 2h) \}, \]  (A2.9)

which is expected to be exact if $f(x)$ is of degree $\le 2$ over the interval $[c, c + 2h]$.

Actually Simpson's rule is better than this. An analysis of the error shows that the error in Simpson's rule is given by $-h^5 f^{(4)}(\xi)/90$, where $\xi$ is a point in $[c, c + 2h]$. For $f(x) = x^3$, $f^{(4)}(x) = 0$ and Simpson's rule is exact for cubic polynomials. The reader may verify this by showing that $\int_0^1 x^3\,dx$ is given exactly by Eq. A2.8.
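The suggested verification can be done in exact arithmetic. The sketch below implements composite Simpson's rule, Eq. A2.8, and applies it with rational numbers so that no rounding obscures the result; the function name and the use of a single three-point panel ($n = 2$, $h = 1/2$) are choices made for this example.

```python
from fractions import Fraction

def simpson(f, a, b, n):
    """Composite Simpson's rule, Eq. A2.8, with n (even) subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

a, b = Fraction(0), Fraction(1)
cubic = simpson(lambda x: x**3, a, b, 2)     # integrand of degree 3
quartic = simpson(lambda x: x**4, a, b, 2)   # integrand of degree 4
quartic_error = Fraction(1, 5) - quartic     # exact value minus Simpson estimate
```

The cubic is integrated exactly, while for $x^4$ the discrepancy equals $-h^5 f^{(4)}/90$ with $h = 1/2$ and $f^{(4)} = 24$, in agreement with the error term quoted above.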

This result may be interpreted as a consequence of symmetry principles: (1) the coefficients in Simpson's rule are symmetric with respect to the middle $x_k$: 1, 4, 1 for Eq. A2.9. (2) For Simpson's rule $n = 3$, odd, and $x^3$ is an odd function. If we set $c = -h$, $c + h = 0$, then both sides of Eq. A2.9 vanish by (anti)symmetry. This additional degree of precision appears for each of the Newton-Cotes formulas where $n$ is odd.


Gaussian Quadrature 

It was pointed out by Gauss that the locations of $x_k$ represent unused parameters that may be used to improve the accuracy of Eq. A2.7, that greater precision can be obtained if the zeros of $\alpha(x)$ are not equally spaced but are chosen as follows.

Take the $x_k$ so that our completely factored $n$th-degree polynomial $\alpha(x)$ is the $n$th-degree polynomial which is orthogonal to all lower degree polynomials over $[a, b]$ with respect to the weighting factor $w(x)$. The most frequently encountered combinations of interval and weighting factor are those in Table 9.3.¹ The $x_k$'s therefore are the $n$ zeros of the $n$th-degree polynomials: Legendre, Hermite, Laguerre, Chebyshev, and so on. Both the $x_k$'s and the corresponding coefficients $A_k$ are tabulated in AMS-55, Chapter 25. Computing subroutines exist in both single and double precision for the Legendre, Laguerre, and Hermite cases.

We shall prove that this choice of $x_k$ (zeros of the appropriate $n$th-degree orthogonal polynomial) makes the quadrature formula A2.7 exact for $f(x)$ a polynomial of degree $\le 2n - 1$. Here is the power of this Gaussian choice. (Taking the $x_k$ equally spaced (Newton-Cotes) is exact only for $f(x)$ a polynomial of degree $\le n - 1$, $n$ even, or $\le n$, $n$ odd.)

Proofs of the necessity and sufficiency of this choice of orthogonal poly- 
nomial roots follow. 

Theorem. A necessary and sufficient condition that an interpolatory formula of the form of Eq. A2.7 be exact for all polynomials of degree $\le 2n - 1$ is that $\alpha(x)$ be orthogonal with respect to $w(x)$ over the interval $[a, b]$ to all polynomials of degree $\le n - 1$.

Necessity. Assume Eq. A2.7 is exact for $f(x)$ any polynomial of degree $\le 2n - 1$. Let $Q_1(x)$ be any polynomial of degree $\le n - 1$. Then $f(x) = \alpha(x)Q_1(x)$ is a polynomial of degree $\le 2n - 1$. By simple substitution, we have

\[ \int_a^b f(x)\,w(x)\,dx = \int_a^b \alpha(x)\,Q_1(x)\,w(x)\,dx, \]  (A2.10a)

and since Eq. A2.7 is assumed exact for this degree polynomial integrand,

\[ \int_a^b \alpha(x)\,Q_1(x)\,w(x)\,dx = \sum_{k=1}^{n} A_k\,\alpha(x_k)\,Q_1(x_k) = 0. \]  (A2.10b)

The final $= 0$ follows because $\alpha(x_k) = 0$, Eq. A2.4. But this is a statement that our $n$th-degree polynomial $\alpha(x)$ is orthogonal to all polynomials $Q_1(x)$ of degree $\le n - 1$.

Sufficiency. Assume the orthogonality of $\alpha(x)$ to all polynomials of
degree $\le n - 1$. Let $f(x)$ be a polynomial of degree $\le 2n - 1$. Dividing
$f(x)$ by $\alpha(x)$, we obtain

$$\frac{f(x)}{\alpha(x)} = Q_2(x) + \frac{p(x)}{\alpha(x)} \tag{A2.11}$$

or

$$f(x) = \alpha(x) Q_2(x) + p(x), \tag{A2.12}$$

with $Q_2(x)$ and $p(x)$ polynomials of degree $\le n - 1$. Integrating yields

$$\int_a^b f(x) w(x)\,dx = \int_a^b \alpha(x) Q_2(x) w(x)\,dx + \int_a^b p(x) w(x)\,dx. \tag{A2.13}$$

The first integral on the right vanishes because of our postulated orthogonality.
Then, because the degree of $p(x)$ is $\le n - 1$, Eq. A2.7 (which is
interpolatory) is exact and we have

$$\int_a^b f(x) w(x)\,dx = \int_a^b p(x) w(x)\,dx = \sum_{k=1}^{n} A_k\,p(x_k). \tag{A2.14}$$

Since $\alpha(x_k) = 0$, Eq. A2.12 yields

$$p(x_k) = f(x_k).$$

Therefore

$$\int_a^b f(x) w(x)\,dx = \sum_{k=1}^{n} A_k\,f(x_k), \tag{A2.15}$$

exact. This is Eq. A2.7, exact for $f(x)$, any polynomial of degree $\le 2n - 1$.

¹ If $a$ and $b$ are finite, the interval $[a, b]$ can always be transformed to
$[-1, 1]$ by the linear transformation $t = [2x - (a + b)]/(b - a)$,
$x = [(b - a)t + (b + a)]/2$. Then
$\int_a^b f(x)\,dx = \frac{b - a}{2} \int_{-1}^{1} f(x(t))\,dt$.
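
The steps of the sufficiency argument can be traced numerically. The Python
sketch below assumes the two-point Legendre case ($n = 2$, $w(x) = 1$ on
$[-1, 1]$, $\alpha(x) = x^2 - 1/3$, $A_1 = A_2 = 1$) and a hypothetical cubic
test polynomial. It divides $f(x)$ by $\alpha(x)$ and confirms that the
quadrature sum over $f(x_k)$ equals the sum over the remainder $p(x_k)$ and the
exact integral. The helpers `polydiv` and `polyval` are written out for this
illustration only.

```python
import math

def polydiv(num, den):
    """Divide polynomials given as coefficient lists (highest power first).
    Returns (quotient, remainder)."""
    num = list(num)
    quot = []
    while len(num) >= len(den):
        c = num[0] / den[0]
        quot.append(c)
        for i, d in enumerate(den):
            num[i] -= c * d
        num.pop(0)               # leading coefficient is now zero
    return quot, num

def polyval(coeffs, x):
    """Evaluate a polynomial (highest power first) by Horner's rule."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result

alpha = [1.0, 0.0, -1.0 / 3.0]                      # alpha(x) = x^2 - 1/3
nodes = [-1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0)]
weights = [1.0, 1.0]

f = [4.0, 3.0, 2.0, 1.0]                            # f(x) = 4x^3 + 3x^2 + 2x + 1
q2, p = polydiv(f, alpha)                           # f = alpha * q2 + p, deg p <= 1

quad_f = sum(A * polyval(f, x) for A, x in zip(weights, nodes))
quad_p = sum(A * polyval(p, x) for A, x in zip(weights, nodes))
exact = 2.0 * 3.0 / 3.0 + 2.0 * 1.0                 # odd powers integrate to zero
print(q2, p, quad_f, quad_p, exact)
```

Here $Q_2(x) = 4x + 3$ and $p(x) = (10/3)x + 2$, and both sums agree with the
exact integral, 4, to rounding error.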

As a specific example of Eq. A2.15, consider the case where $[a, b] = [-1, 1]$
with $w(x) = 1$. The polynomials orthogonal over this interval with respect to
this weighting function are the Legendre polynomials of Chapter 12. For the
choice $n = 10$ the $x_k$ are the 10 roots of $P_{10}(x)$. The values of $A_k$
are given in principle by Eq. A2.7. A more convenient expression is derived by
Krylov.² Finally, with the numerical values of $A_k$ and $x_k$, Eq. A2.15 becomes

$$
\begin{aligned}
\int_{-1}^{1} f(x)\,dx = {} & 0.06667134\, f(+0.97390652) \\
& + 0.14945134\, f(+0.86506336) \\
& + 0.21908636\, f(+0.67940956) \\
& + 0.26926671\, f(+0.43339539) \\
& + 0.29552422\, f(+0.14887433) \\
& + 0.29552422\, f(-0.14887433) \\
& + 0.26926671\, f(-0.43339539) \\
& + 0.21908636\, f(-0.67940956) \\
& + 0.14945134\, f(-0.86506336) \\
& + 0.06667134\, f(-0.97390652),
\end{aligned}
\tag{A2.16}
$$

exact (to the number of digits listed) for $f(x)$ a polynomial of degree $\le 19$.
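
The following Python sketch applies Eq. A2.16 exactly as tabulated to two
monomials, one inside and one just outside the range of exactness. The function
name `gauss10` is hypothetical; the numbers are the eight-digit values listed
above, so agreement is limited to roughly that many digits.

```python
# Abscissas x_k > 0 and coefficients A_k as listed in Eq. A2.16 (8 digits)
table = [
    (0.97390652, 0.06667134),
    (0.86506336, 0.14945134),
    (0.67940956, 0.21908636),
    (0.43339539, 0.26926671),
    (0.14887433, 0.29552422),
]

def gauss10(f):
    """Apply Eq. A2.16: each listed x_k is used with both signs."""
    return sum(A * (f(x) + f(-x)) for x, A in table)

err18 = abs(gauss10(lambda x: x**18) - 2.0 / 19.0)   # degree 18 <= 19
err20 = abs(gauss10(lambda x: x**20) - 2.0 / 21.0)   # degree 20 > 19
print(f"degree 18: |error| = {err18:.2e}")
print(f"degree 20: |error| = {err20:.2e}")
```

For degree 18 the remaining error reflects only the finite number of digits in
the table, whereas for degree 20 the formula itself is no longer exact.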


² Tabulations of the $A_k$ and $x_k$ are found in the references that follow
and in AMS-55 (Chapter 25).
The actual usefulness of Gaussian integration is contingent upon two factors:
(1) the availability of computers and (2) the availability of the values of
$f(x)$ at $x = x_k$. This generally means that $f(x)$ must be expressed in
closed form or approximated in some convenient form so that $f(x_k)$ may readily
be calculated. If $f(x)$ is given only as equally spaced tabulated values,
Simpson’s rule is probably the best choice for the numerical integration.
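
When $f(x)$ is available in closed form, a Gauss-Legendre evaluation over a
finite interval $[a, b]$ might be organized as in the following Python sketch.
It uses the linear change of variable $x = [(b - a)t + (b + a)]/2$ to bring the
interval to $[-1, 1]$. The names `gauss_legendre` and `integrate`, and the test
integrand $\sin x$ on $[0, \pi]$, are assumptions made for illustration.

```python
import math

def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # starting guess for i-th zero
        for _ in range(50):                              # Newton iteration on P_n(x) = 0
            p_prev, p = 1.0, x
            for k in range(2, n + 1):
                p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
            dp = n * (x * p - p_prev) / (x * x - 1.0)    # P_n'(x)
            dx = p / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

def integrate(f, a, b, n=10):
    """Approximate the integral of f over the finite interval [a, b]."""
    nodes, weights = gauss_legendre(n)
    half, mid = 0.5 * (b - a), 0.5 * (b + a)
    # x = half * t + mid maps t in [-1, 1] onto x in [a, b]; dx = half * dt
    return half * sum(A * f(half * t + mid) for t, A in zip(nodes, weights))

approx = integrate(math.sin, 0.0, math.pi, n=5)
print(approx)   # to be compared with the exact value 2
```

Only the $n$ values $f(x_k)$ are needed, which is why a closed form (or a cheap
approximation) for $f(x)$ matters more here than for an equally spaced rule.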

Warning. Our fundamental assumption is that $f(x)$ can be accurately
represented by a $(2n - 1)$-degree polynomial with n reasonably small. If $f(x)$
has a singularity in the integration interval, this assumption of a polynomial
representation is obviously not valid. Even if $f(x)$ remains finite, the
presence of an infinite slope means our assumption is poor and that numerical
accuracy will be relatively low. Exercise A2.7 illustrates these points.
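
The effect of an infinite slope can be seen in a short experiment. The Python
sketch below compares a 10-point Gauss-Legendre evaluation of
$\int_0^1 e^x\,dx$ (smooth integrand) with $\int_0^1 x^{1/2}\,dx$ (infinite
slope at $x = 0$). The two test integrands and the helper names are choices
made for this illustration, not examples from the text.

```python
import math

def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # starting guess for i-th zero
        for _ in range(50):                              # Newton iteration on P_n(x) = 0
            p_prev, p = 1.0, x
            for k in range(2, n + 1):
                p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
            dp = n * (x * p - p_prev) / (x * x - 1.0)    # P_n'(x)
            dx = p / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

def integrate01(f, n=10):
    """n-point Gauss-Legendre estimate of the integral of f over [0, 1]."""
    nodes, weights = gauss_legendre(n)
    return 0.5 * sum(A * f(0.5 * (t + 1.0)) for t, A in zip(nodes, weights))

err_smooth = abs(integrate01(math.exp) - (math.e - 1.0))
err_sqrt = abs(integrate01(math.sqrt) - 2.0 / 3.0)
print(f"exp(x):  |error| = {err_smooth:.2e}")
print(f"sqrt(x): |error| = {err_sqrt:.2e}")
```

With the same ten points the smooth integrand is reproduced essentially to
machine precision, while the square root is good to only a few figures.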


EXERCISES 


2.1 (a) Verify Eq. A2.5.

(b) With $P(x)$ a polynomial of degree $\le n - 1$ and $\alpha(x)$ given by
Eq. A2.4, verify that

$$P(x) = \sum_{k=1}^{n} \frac{\alpha(x)}{(x - x_k)\,\alpha'(x_k)}\,P(x_k).$$


2.2 Using a 10-point Gauss-Legendre subroutine, evaluate

$$\int_0^1 x^n\,dx \quad \text{for } n = 0(1)40.$$

Tabulate the computed value of the integral, the exact value, and the relative error.
Plot log (relative error) versus n.
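
One possible way to set up the tabulation for this exercise is sketched below
in Python. The routine `gauss_legendre` is a hypothetical stand-in for the
"10-point Gauss-Legendre subroutine" (it computes the nodes by Newton iteration
rather than reading a table), and the plot is left to the reader.

```python
import math

def gauss_legendre(n):
    """Nodes and weights of the n-point Gauss-Legendre rule on [-1, 1]."""
    nodes, weights = [], []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # starting guess for i-th zero
        for _ in range(50):                              # Newton iteration on P_n(x) = 0
            p_prev, p = 1.0, x
            for k in range(2, n + 1):
                p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
            dp = n * (x * p - p_prev) / (x * x - 1.0)    # P_n'(x)
            dx = p / dp
            x -= dx
            if abs(dx) < 1e-15:
                break
        nodes.append(x)
        weights.append(2.0 / ((1.0 - x * x) * dp * dp))
    return nodes, weights

nodes, weights = gauss_legendre(10)
rel_errors = []
for n in range(41):
    # map [-1, 1] onto [0, 1]: x = (t + 1) / 2, dx = dt / 2
    approx = 0.5 * sum(A * (0.5 * (t + 1.0)) ** n for t, A in zip(nodes, weights))
    exact = 1.0 / (n + 1)
    rel_errors.append(abs(approx - exact) / exact)
    print(f"{n:2d}  {approx:.15f}  {exact:.15f}  {rel_errors[-1]:.3e}")
```

The relative error stays at rounding level through $n = 19$ and grows for
larger $n$, in line with exactness for degree $\le 2 \cdot 10 - 1$.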


2.3 Using a 10-point Gauss-Laguerre subroutine, evaluate

$$\int_0^\infty x^n e^{-x}\,dx \quad \text{for } n = 0(1)25.$$

Tabulate the computed value of the integral, the exact value, and the relative error.
Plot log (relative error) versus n.


2.4 Using a 10-point Gauss-Hermite subroutine, evaluate

$$\int_{-\infty}^{\infty} x^n e^{-x^2}\,dx \quad \text{for } n = 0(2)22.$$

Tabulate the computed value of the integral, the exact value, and the relative error.
Plot log (relative error) versus n.


2.5 (a) Write a double precision Gauss-Chebyshev subroutine that will evaluate
integrals of the form

$$\int_{-1}^{1} \frac{f(x)}{(1 - x^2)^{1/2}}\,dx$$

using 20 points, the 20 roots of the Chebyshev polynomial, $T_{20}(x)$. These roots
and the coefficients $A_k$ are tabulated by Stroud and Secrest.

(b) Check your subroutine by using it to compute

$$\int_{-1}^{1} x^{2n} (1 - x^2)^{1/2}\,dx$$

for n = 0(2)30. Tabulate the computed value of the integral, the exact value, and
the relative error. Plot log (relative error) versus n.
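
For part (a), a compact sketch in Python is possible because the Gauss-Chebyshev
abscissas and coefficients have the well-known closed forms
$x_k = \cos[(2k - 1)\pi/2n]$ and $A_k = \pi/n$, so no table is required. The
function name `gauss_chebyshev` and the check integrand $f(x) = x^2$ are
assumptions of this illustration; the exercise itself asks for tabulated roots
and a double precision subroutine.

```python
import math

def gauss_chebyshev(f, n=20):
    """Approximate the integral of f(x) / sqrt(1 - x^2) over [-1, 1]."""
    total = 0.0
    for k in range(1, n + 1):
        x_k = math.cos((2 * k - 1) * math.pi / (2 * n))   # k-th zero of T_n(x)
        total += f(x_k)
    return math.pi / n * total                            # every A_k equals pi / n

check = gauss_chebyshev(lambda x: x * x)
print(check, math.pi / 2)   # the exact value of this test integral is pi / 2
```

Any integrand to be checked must first be written as $f(x)/(1 - x^2)^{1/2}$, with
$f(x)$ passed to the routine.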


2.6 Evaluate

$$4 \int_0^1 \frac{dx}{1 + x^2},$$

using Gauss-Legendre quadrature. How many evaluation points are needed to
obtain a result accurate to 5 significant figures? to 12 significant figures?

ANS. 4-point Gauss-Legendre quadrature => 5 significant figures
     12 points => 12 significant figures


2.7 From Exercise 10.2.11 the Euler-Mascheroni constant $\gamma$ may be written as

1. $\gamma = -\int_0^\infty \ln r\, e^{-r}\,dr$

2. $\gamma = 1.0 - \int_0^\infty r \ln r\, e^{-r}\,dr$

3. $\gamma = 1.5 - 0.5 \int_0^\infty r^2 \ln r\, e^{-r}\,dr.$

(a) Explain why Gauss-Laguerre quadrature should not be attempted on the first
integral.

(b) Evaluate (2) and (3) using a 32-point Gauss-Laguerre quadrature and explain
the very limited accuracy of your results.

[32 points => 3 significant figures]


2.8 (a) Evaluate the integral

$$I = \int_{-\infty}^{\infty} \frac{e^{-x^2}}{1 + |x|}\,dx$$

using Gauss-Hermite quadrature formulas for several values of n (number of
evaluation points).

(b) Rewrite the integral as

$$I = 2 \int_0^\infty \frac{e^{-x^2 + x}}{1 + x}\, e^{-x}\,dx$$

and evaluate by Gauss-Laguerre quadrature for several values of n.

ANS. (b) 1.2103.
Abel’s equation, 875, 878 
Addition theorem 
Bessel functions, 585 
Legendre polynomial, spherical harmonic, 261, 693-698, 913 
Adjoint operator, 498 
Analytic continuation, 378-380 
Bromwich integral, 855 
factorial reflection relation, 413, 542 
gamma function, 545 
Analytic functions, 362 
Cauchy integral formula and, 371-376 
Cauchy integral theorem and, 365-371 
conformal mapping and, 392-394 
Angular momentum operator, 42, 46, 108, 109, 
451, 684-692 

associated Legendre equation, 451 
vector spherical harmonics, 711 
Anomalous dispersion, 857-859 
Antisymmetry, determinants, 170 
Associated Laguerre equation, polynomials. See 
Laguerre equation; Laguerre functions 
Associated Legendre equation, functions. See Leg- 
endre equation; Legendre functions 
Asymptotic series, 339-346 
applications to computing, 344 
Bessel functions, 617-620 
confluent hypergeometric functions, 757 
cosine, sine integrals, 342 
incomplete gamma function, 339 
integral representation expansion, 339-343, 
616-618, 757 

steepest descent, 431-436 
Stokes’ method, 622 
Axial vector. See Pseudovector 


Bernoulli functions, 330 
Bernoulli numbers, 327-330, 350, 413, 414, 
555, 775 

Bessel equation, 114, 577 
Laplace transform solution, 842-844 
self-adjoint form, 500 
series solution, 459, 474 
singularities, 453 
spherical, 116, 622 
Bessel functions, 573-636 
asymptotic expansion, 616-622 
Bessel series, 592 

confluent hypergeometric representation, 755, 
756 


cylindrical waveguide, 600 
first kind, 573-591 
Fourier transform, 805, 806, 847 
generating function, 573, 575, 585 
Hankel functions. See Hankel functions 
integral representation, 578-580, 588 
Laplace transform, 843, 846 
modified. See Modified Bessel functions 
nonintegral order, 584 
orthogonality, 591-596 
recurrence relations, 576 
second kind. See Neumann functions 
series form, 459, 575 
spherical, 622-636 
asymptotic forms, 627 
definitions, 623 
orthogonality, 628, 629 
recurrence relations, 627 
spherical modified, 633, 634 
Wronskian relation, 631, 634 
Wronskian formulas, 599, 602, 605, 608, 622 
zeros, 581 

Bessel inequality, 526, 533 
Bessel integral, 588 
Beta function, 560-565 
incomplete, 292, 319, 562, 752 
Laplace convolution, 853 
Binomial theorem, 307 
Biot and Savart law, 32, 674 
Bisection (root finding), 964 
Boltzmann equation, 866 
Born approximation, 916, 923 
Bose-Einstein statistics, 950 
Boundary conditions, 502 
hollow cylinder, 592, 614 
integral equations and, 871, 899, 902 
magnetic field of current loop, 672-676 
ring of charge, 658 
sphere in uniform electric field, 656 
Sturm-Liouville theory, 503 
waveguide, coaxial cable, 600 
Branch points, 397 
Bromwich integral, 419, 853-861 


Calculus of residues, 396-421 
evaluation of definite integrals, 403-421, 856 
Jordan’s lemma, 408 
Calculus of variations, 925-974 
constraints, 945, 950 
Euler equation, 928 
Hamilton’s principle, 938 
Hilbert-Schmidt integral equation, 957 
integral equations, applications to, 887 
Lagrangian equations, 939, 951 
Lagrangian multipliers, 945-950 
Rayleigh-Ritz variational technique, 957-961 
soap films, 931-937 
Sturm-Liouville equation, 957 
surface of revolution, 931 
Catalan’s constant, 292, 320, 334, 551 
Cauchy convergence tests, 281-283 
Cauchy principal value, 401, 422, 490, 914 
Cauchy-Riemann conditions, 360-365, 429 
fluid flow, 47, 364 
Laplace’s equation, 363 
polar coordinates, 364 
Cauchy’s integral formula, 371-376 
calculus of residues, 374 
derivatives of analytic functions, 373 
Cauchy’s integral theorem, 365-371 
Cauchy-Goursat proof, 368, 369 
Causality, 425, 822 
Cayley-Klein parameters, 10, 253 
Cavity, cylindrical resonant, 582, 590 
Chebyshev equations, 735 
convergence of series solution, 292 
self-adjoint form, 500 
singularities, 454 
Chebyshev functions, 731-748 
discrete orthogonality, 792 
Fourier transform, 807 
generating functions, 731, 737 
Gram-Schmidt construction, 522, 523 
hypergeometric representations, 751 
orthogonality, 737 
recurrence relations, 732 
series of 

numerical applications, 740-748 
truncation, telescoping, 744-746 
shifted, 746 

trigonometric form, 736, 741 
Christoffel symbol, 160, 162 
Circular cylindrical coordinates, 95-101 
Circular membrane, Bessel functions, 589 
Clausen functions, 783 
Closure, 529, 536 
Bessel functions, 594, 635 
spherical harmonics, 684 
Completeness 

eigenfunctions of Hilbert-Schmidt integral 
equation, 893 
Fourier series, 761 

Sturm-Liouville eigenfunctions, 523-538 
Complex variables, 352-395, 396-436 
calculus of residues, 396-421 
Cauchy-Riemann conditions, 360-365 
Cauchy’s integral formula, 371-376 
Cauchy’s integral theorem, 365-371 
complex algebra, 353-360 
contour integrals, 365, 919 
mapping, 384-392 
conformal, 392-394 


Confluent hypergeometric equation, 753 
second solution, 753 
singularities, 454 

Confluent hypergeometric functions, 753-758 
asymptotic expansions, 757 
Bessel functions, 755, 756 
Hermite functions, 755 
Laguerre functions, 755 
Whittaker functions, 756 
Wronskian relation, 757 
Conformal mapping, 392-394 
Continuity equation, 40 
Contraction of tensors, 124 
Contravariant tensor, 120 
Convergence, 280-293. See also Infinite series 
analytic continuation and, 378-380 
improvement of, 288, 296, 334 
series solution of 
Chebyshev equation, 292 
Legendre equation, 288, 291, 464 
ultraspherical equation, 292 
Convolution theorem 
Fourier transforms, 810-814 

Laplace transforms, 849-853, 860 
Coordinate system. See specific coordinate system 
Cosine x, infinite product representation, 348 
Cosine integral, 565 
asymptotic expansion, 342, 343 
confluent hypergeometric representation, 756 
Covariant differentiation, 161 
Covariant tensor, 120 

Cross product of vectors. See Vector product of 
vectors 

Crossing conditions, 423 
Curl 

coordinates, cartesian, 42-47 
circular cylindrical, 97 
curvilinear, 92 
spherical polar, 104 
integral definition, 55 
irrotational, 44, 49, 67, 79, 150 
tensor, 166 

Curvilinear coordinates, 86-90 
differential vector operations, 90-94 
metric, 87 
scale factors, 87 


D’Alembertian, 126, 152, 437 
Degeneracy, eigenvalues, 220, 223, 228, 513 
Del, 33 

successive applications, 47-51 
Delta function, Dirac, 81, 481-484 
Bessel representation, 797 
eigenfunction expansion, 528, 684 
Fourier integral, 799, 800 
Green’s function and, 485, 905, 909 
impulse force, 835, 836 
Laplace transform, 834 
point source, 905, 909 
quantum theory, 816 





sequences, 483, 484, 488, 490, 780, 804, 805 
sine, cosine representations, 805 
spherical polar coordinates, 484 
theory of distributions, 483, 484 
De Moivre’s formula, 356 
Deuteron, eigenfunction-eigenvalue, 500-502, 
819 

Descending power series solution, 320 
Determinants, 168-176 
antisymmetry, 170 

Laplacian development by minors, 169 
representation of a vector product, 21, 93, 97, 
104 

secular equation, 221 
solution of set 

of homogeneous equations, 171, 222, 884 
of nonhomogeneous equations, 172 
Gauss elimination, 172 
Gauss-Jordan elimination, 173 
Differential equations, 437-496. See also specific 
differential equation 
eigenfunctions, eigenvalues, 499 
first order, 440-447 
exact, 441 
linear, 442 
separable, 440 
Fuchs’s theorem, 462, 472 
nonhomogeneous 

Green’s function solution, 480-491 
particular solution, 455, 479 
numerical solutions, 491-496 
second order, absence of third solution, 477 
second solution, 467-480, 507 
logarithmic term, 473 
self-adjoint, 497-509 

separation of variables, 111-117, 440, 448-451 
series solution (Frobenius), 454-467 
singular points, 451-454 
Diffusion equation, solutions, 450 
Dipoles. See also Electric dipole; Magnetic dipole 
interaction energy, 18 
radiation fields, 110 

Dirac delta function. See Delta function, Dirac 
Dirac matrices, 211-213 
Direct product 
matrices, 179 
tensors, 124 

Direction cosines, 4, 191. See also Matrices, 
orthogonal 
identities, 22 

orthogonality condition, 11, 194, 195 
Dirichlet integral, 563 
Dirichlet kernel, 482 
Dispersion, anomalous, 857-859 
Dispersion theory, 421-428, 803 
crossing relations, 423 
Hilbert transform, 423 
sum rules, 424 
symmetry, 423 
Divergence 

coordinates, cartesian, 37-42 
circular cylindrical, 97 


curvilinear, 90 
spherical polar, 104 
integral definition, 55 
solenoidal, 41, 49, 79, 150 
tensor, 164 

Dot product of vectors. See Scalar product of 
vectors 

Dual tensor, 132, 157. See also Pseudotensor 
Duplication formula for factorial functions. See 
Legendre duplication formula 
Dyadics, 137-140 


Eigenfunctions, 499 
completeness of, 523-538, 893 
degeneracy, 513 

expansion of Dirac delta function, 528 
of Green’s function, 529 
of square wave, 512 
Hermitian differential operators, 511 
integral equations, 891 
orthogonality, 511 
variational calculation, 958 
Eigenvalues, 499, 892 
Hermitian differential operators, 510 
Hermitian matrices, 219 
Hilbert-Schmidt integral equations, 892 
normal matrices, 229 
real, 219, 892 
variational principle for, 958 
Eigenvectors 

Hermitian matrices, 219 
normal matrices, 229 
Eight-fold way, 270 
Einstein velocity addition law, 157, 275 
Elasticity, 140-150 
cubic symmetry, 149 
Hooke’s law, 146, 148 
isotropic solid, 149 
stress, 142 
strain, 140 

Electric dipole, 47, 641 
Legendre expansion, 641, 676 
Electromagnetic invariants, 155, 157 
Elliptic integrals, 321-327 
first kind, 322 

hypergeometric representations, 749 
second kind, 323 
Error integrals, 568 
asymptotic expansion, 345 
confluent hypergeometric representation, 756 
Essential singularity, 396, 400, 452 
Euler angles, 10, 198-200, 204, 256-259 
Euler equation, 928 
Euler identity, 350 

Euler-Maclaurin integration formula, 330-332, 
555 

Euler-Mascheroni constant, 284, 291, 310, 338, 
346, 550, 571, 860 

Exponential integral function, 339-341, 566 
Laplace transform, 847 
Factorial function, 433, 539-572. See also 
Gamma function 
complex argument, 548 
contour integrals, 545 
digamma function, 549-556, 597 
double factorial notation, 292, 544 
infinite product, 541 
integral representation, 540, 545 
Legendre duplication formula, 556, 561 
Maclaurin expansion, 551 
polygamma functions, 550 
reflection relation, 544 
contour integrals, 412, 413, 418 
infinite products, 348, 349 
relation to gamma function, 543 
steepest descent asymptotic formula, 433 
Stirling’s series, 555-560 
Fast Fourier transform, 791 
Fermat’s principle, 936 
Fermi age equation, 809 
Fermi-Dirac statistics, 950 
Force-potential relation, 36, 64-69 
Fourier-Bessel series, 592 
Fourier-Mellin integral, 854 
Fourier series, 760-793 
advantages, 766-769 
completeness, 761 
differentiation, 779 
Gibbs phenomenon, 783-787 
integration, 778 
interval, change of, 768 
orthogonality, 512, 761 
square wave, 512, 770, 784 
Sturm-Liouville theory, 512, 762 
summation of, 763 
uniform convergence, 303 
Fourier transform, 794-823 
aliasing, 790 

convolution theorem, 810-814 
delta function derivation, 799 
discrete transform, 787-792 
fast Fourier transform, 791 
finite wave train, 801-803 
Fourier integral, 797-799 
inversion theorem, 800-807 
momentum representation, 814-820 
solution of integral equation, 875 
transfer functions, 820-823 
transform of derivatives, 807-810 
Fraunhofer diffraction, Bessel functions, 580 
Fredholm integral equation, 865. See also Inte- 
gral equations 

Fresnel integrals, 419, 632, 756, 807 
Frobenius’ method. See Series solution of differen- 
tial equations 

Fuchs’s theorem, 462, 472 


Gamma function, 433, 539-572. See also Fac- 
torial function 

complex argument, 548, 551 


definite integral (Euler) definition, 540 
digamma function, 549-555 
infinite limit (Euler) definition, 539 
infinite product (Weierstrass) definition, 541 
polygamma functions, 550 
recurrence relation, 539, 546 
reflection identity, 542 
Gauge transformation, 74, 156 
Gauss’ differential equation. See Hypergeometric 
differential equation 

Gauss’ error function, asymptotic expansion, 345 

Gauss’ law, 74-77, 485 
two dimensional case, 78 
Gauss’ theorem, 57-61, 75, 90 
dyadics, 139 

Gaussian quadrature, 968-974. See also 
Quadrature 

Gegenbauer polynomials. See Ultraspherical 
polynomials 
Generating function, for 
Bernoulli numbers, 327 
Bessel functions, 416, 573, 575, 585 
modified Bessel functions, 416, 613 
Chebyshev polynomials, 416, 731 
Hermite polynomials, 416, 712 
Laguerre polynomials, 416, 723 
associated Laguerre polynomials, 725 
Legendre polynomials, 416, 637 
associated Legendre functions, 668 
ultraspherical polynomials, 731 
Generators, group, 261-267 
Gibbs phenomenon, 783-787 
Gradient 

constrained derivative, 949 
coordinates, cartesian, 33-37 
circular cylindrical, 97 
curvilinear, 90 
spherical polar, 104 
integral definition, 55 
force-potential relationship, 64-69 
Gram-Schmidt orthogonalization, 516-523 
Green’s functions, 897-923 
construction of 
one dimension, 898-901 
two, three dimensions, 910, 911 
delta function, 905, 909 
eigenfunction expansion, 529 
electrostatic analog, 480, 897 
Helmholtz equation, 529, 908, 912 
integral equation-differential equation 
equivalence, 901-904 
Laplace operator, 910-912 
circular cylindrical expansion, 914 
spherical polar expansion, 911-913 
modified Bessel function, 915 
modified Helmholtz equation, 908, 912 
Poisson’s equation, 485, 897 
spherical Bessel functions, 921 
symmetry property, 486, 901 
Green’s theorem, 58, 79 
Group theory, 237-275 
character, 241 
continuous groups, 251-275 
generator, 261-267 
homomorphism, SU(2)-O₃⁺, 255-258 
Lorentz group, 271-275 
orthogonal group, O₃⁺, 252 

rotation matrix, 258, 259 
special unitary group, SU(2), 253 

definitions, 238 
discrete groups, 243-251 
dihedral groups, 244 
irreducible representations, 240 
isomorphism, 239, 242 
permutation groups, 249 
vierergruppe, 239, 245 


Hadamard product, 205 

Hamilton’s principle and Lagrange equations of 
motion, 938-940 
Hankel functions, 603-610 
asymptotic forms, 431, 618, 619 
integral representations, 606-610 
series expansion, 604 
spherical, 623 

Wronskian formulas, 605, 608 
Hankel transforms, 795-797 
Harmonics. See also Spherical harmonics 
sectoral, tesseral, zonal, 685 
tensor spherical, 140 
vector spherical, 707-711 
Heaviside expansion theorem, 400, 861 
Heaviside shifting theorem, 840 
Heaviside unit step function, 484, 490. See also 
Step function 

Helmholtz equation, 85, 437, 610, 622, 666 
Green’s function, 490, 908, 912 
solutions, 450 
Helmholtz theorem, 78-84 
Hermite equation, 714 
convergence of series solution, 464 
self-adjoint form, 500 
singularities, 454 
Hermite functions, 712-721 
confluent hypergeometric representation, 755 
generating function, 712 
Gram-Schmidt construction, 522 
orthogonality, 714 
recurrence relations, 712 
Hermitian differential operator, 504, 510-516 
completeness of eigenfunctions, 523-538 
eigenfunctions, orthogonal, 511 
eigenvalues, real, 510 
integration interval, 504 
in quantum mechanics, 505 
Hermitian matrices, 209 
real eigenvalues, orthogonal eigenvectors, 
219-221 

Hilbert matrix, determinant, 175, 233, 535 
Hilbert space, 13, 534, 760, 795, 812 
Hilbert-Schmidt integral equation, 890-897 


variational analog, 957 
Hilbert transforms, 423 
Hubble’s law, 7 
Hydrogen atom, 726-728 
associated Laguerre equation, 727 
electrostatic potentials, 569, 697 
momentum representation, 816 
Hydrogen molecular ion, 466 
Hypergeometric equation, 748 
alternate forms, 750 
second independent solution, 750 
singularities, 454 
Hypergeometric functions, 748-752 
Chebyshev functions, 751 
Legendre functions, 750 


Ill-conditioned systems, 233, 535, 829 
Impulse function. See Delta function, Dirac 
Incomplete gamma function, 319, 339-341, 
565-571 

confluent hypergeometric representation, 753 
recurrence relations, 569 
Indicial equation, 456 
Inertia, moment of, 217-219 
Infinite products, 346-350 
convergence, 347 
cosine, 347 

gamma function, 347, 541 
sine, 347 

Infinite series, 277-351 
algebra of series, 295-299 
double series, 297, 298 
alternating series, 293, 294 
Cauchy criterion, 277 
Chebyshev truncation, 744-746 
convergence 
absolute, 294 
conditional, 294 
improvement of, 288, 296, 334 
uniform, 299 

convergence, tests for, 280-293 
Abel’s, 302 
Cauchy integral, 283 
Cauchy ratio, 282 
Cauchy root, 281 
comparison, 280 
D’Alembert ratio, 282 
Gauss’, 287, 290 
Kummer’s, 285 
Maclaurin integral, 283 
Raabe’s, 286 
Weierstrass M, 301 
functions, series of, 299-303 
geometric series, 278 
harmonic series, 279 
Leibnitz criterion, 293 
partial sums, 277, 340 
power series, 313-321 
Riemann’s theorem, 295 
telescoping, 744-746 
Integral equations, 865-924 
delta function, Dirac, 905, 909 
differential equation-integral equation transfor- 
mation, 869, 901-904 
Fredholm equation, 865 
Hilbert-Schmidt theory, 890-897 
integral transforms, 874, 875 
Neumann series, 879-882, 915 
nonhomogeneous integral equation, 894 
numerical solution, 885-887 
orthogonal eigenfunctions, 891 
separable kernel, 882-885 
solution by generating function, 876 
Volterra equation, 865 

Integral transforms, 794-864. See also Fourier 
transform; Hankel transform; Laplace 
transform; Mellin transform 
Fourier, 794-823 
Hankel, 795-797 
Laplace, 795, 824-863 
Mellin, 795, 797 
Integrals 

contour, 365, 919 
differentiation of, 478 
evaluation by 
beta functions, 560-565 
contour integration, 403-421 
Lebesgue, 524 
Riemann, 365, 511 
Stieltjes, 484 
Integration, vector, 51-57 
line integrals, 51 
surface integrals, 53 
volume integrals, 54 
Interpolating polynomial, 190, 968 
Inverse operator, uniqueness of, 826 
Inversion of power series, 316 
Irreducible 
groups, 240 
tensors, 134 
Isomorphic, 4, 184, 239 


Jacobi-Anger expansion, 585 
Jacobi identity, 185 
Jacobian, 37, 89 
Jordan’s lemma, 408 


Kernels, of integral equations of form k(x - t), 875, 877 
separable, 882-886 
Kirchhoff diffraction theory, 879, 922 
Kronecker delta, 11 
mixed second-rank tensor, 122 
Kronig-Kramers dispersion relations, 421, 424 
Kummer’s equation. See Confluent hyper- 
geometric equation 
Kummer’s first formula, 754 


Lagrangian, 939 
Lagrangian multipliers, 945-950 
Laguerre equation, 721 
associated Laguerre equation, 116, 500, 725 
self-adjoint form, 500 
self-adjoint form, 500 
singularities, 454 
Laguerre functions, 721-731 
associated Laguerre polynomials, 725-728 
orthogonality, 726 
recurrence relations, 725 
Rodrigues’ representation, 726 
confluent hypergeometric representation, 755 
generating function, 723 
Gram-Schmidt construction, 522 
integral representation, 722 
Laplace transform, 847 
orthogonality, 724 
recurrence relations, 724 
Lame’s constants, 147 
Laplace equation, 48, 79, 437, 480 
minimum energy, 943 
solutions, 450, 451 
uniqueness of, 79 
Laplace transform, 795, 824-863 
convolution theorem, 849-853, 860, 875 
derivative of transform, 842, 843 
integration of transforms, 844, 845 
inverse transformation, 826-830, 853-861 
solution of integral equation, 875 
substitution, 838 
table of operations, 862 
of transforms, 863 
transform of derivatives, 831-838 
translation, 840 
Laplacian 
scalar, 48, 60 
coordinates, cartesian, 48 
circular cylindrical, 97 
curvilinear, 92 
spherical polar, 104 
tensor, 165 
vector, 49 

coordinates, cartesian, 49 
circular cylindrical, 97 
spherical polar, 105 
Laurent expansion, 308-383, 573, 761 
Legendre duplication formula, 556, 561, 623 
Legendre equation, 448, 647 
associated Legendre equation, 116, 666 
self-adjoint form, 500 
convergence of series solution, 291, 464 
self-adjoint form, 500 
singularities, 454 
Legendre functions, 637-711 
associated Legendre functions, 106, 666-680 
orthogonality, 670-672 
parity, 669 

recurrence relations, 668 
relation between +M and -M, 668, 677 
electric multipoles, 641-644, 651, 676 
Fourier transform, 807 
generating function, 637, 876 
Gram-Schmidt construction, 519 
hypergeometric representations, 750 
Legendre differential equation, 647 
Legendre polynomials, 637 
Legendre series, 654 
orthogonality, 652-663 
parity, 649 

polarization of dielectric, 661 
recurrence relations, 645, 707 
ring of electric charge, 658 
Rodrigues’ formula, 663, 670, 691 
Schlaefli integral, 664 
second kind, 701-707 
closed form solutions, 704-706 
series solution of Legendre equation, 464, 
701-703 

shifted Legendre functions, 522 
sphere in uniform electric field, 656 
spherical harmonics. See Spherical harmonics 
Legendre polynomial addition theorem 
derivation from Green’s function, 913 
group theory, 261 

two spherical polar coordinate systems, 
693-698 

Leibnitz formula, for 
differentiating an integral, 478 
differentiating a product, 667, 670 
π, 318, 763, 111 
Lerch’s theorem, 826 
Levi-Civita symbol, 132, 133, 168 
Lie groups, 252 
generators, 252, 273 
L’Hospital’s rule, 310, 567, 596 
Linear independence, 5, 468 
Linear operator, 113, 184, 188 
differential operator, 113 
integral transforms, 795, 824 
Liouville’s theorem, 375, 399 
Liquid drop model, 679, 949 
Logarithmic integral function, 567 
Lommel integrals, 594 
Lorentz covariance of Maxwell’s equations, 
150-164 

Lorentz relation, 151 
Lorentz transformation, 154, 273 


Maclaurin series, 40, 551 
Maclaurin theorem, 305 
Madelung constant, 298 
Magnetic dipole, 47, 110, 676 
Magnetic field of current loop, 325, 672-676, 
806, 846 

Magnetic vector potential. See Potential theory, 
vector potential 

Mapping, conformal. See Conformal mapping 
Matrices, 176-237 
adjoint, 210 

angular momentum matrices, 186, 187, 262 


anticommuting sets, 211-213 
antihermitian, 221 
definition, 177 
direct product, 179 
diagonalization, 217-229 
Dirac, 211-213 
Euler angle rotation, 198-200 
Hermitian, 210, 219 
unitary relation, 215, 236, 261 
ill-conditioned, 233 
inverse, 181, 196 

Gauss-Jordan matrix inversion, 182 
ladder operators, 187 
matrix multiplication, 178 
moment of inertia, 217 
normal, 229-231 
orthogonal, 191-205 
Pauli spin, 186, 211 
quaternions, 185 
relation to tensors, 203 
similarity transformation, 201-203 
trace, 181, 188 
transpose, 196 
unitary, 210 

vector transformation law, 194 
Maxwell’s equations 
derivation of wave equation, 49 
dual transformation, 157 
Gauss’ law relation, 77 
Lagrangian for, 945 
Lorentz covariance, 150-158 
Mellin transforms, 795, 797 
Metric, 87, 127, 158, 162, 208 
Minkowski space, 134, 152, 272 
Mixed tensor, 120 
Modified Bessel functions, 610-616 
asymptotic expansion, 618, 619 
Fourier transform, 806 
generating function, 613 
$I_\nu$, $K_\nu$, 610, 612 
integral representation, 613-616 
Laplace transform, 846 
recurrence relations, 611, 614 
series form, 611 
Wronskian relation, 615 
Momentum representation, Schrodinger wave 
equation, 814-820, 867-869 
Morera’s theorem, 373, 374 


Navier-Stokes equation, 50, 98, 100 
Neumann functions, 596-604 
asymptotic form, 619 
Fourier transform, 806 
recurrence relations, 599 
series form, 474, 597 
spherical Neumann functions, 623 
Wronskian formulas, 599, 602 
Neutron diffusion theory, 319, 810 
Boltzmann transport equation, 866 
Newton’s root finding formula, 310, 963 
Normal matrices, 229-231 
Normal modes of vibration, 231-233 
Numerical analysis 
asymptotic series, 339-346 
Bessel, modified Bessel functions, 620, 621 
cosine, sine integrals, 342, 343 
exponential integral, 339-341 
Gauss error function, 345 
Stirling’s series, 344 

Chebyshev truncation, telescoping, 744-746 
computation 

Bessel functions, 577, 621 
Chebyshev polynomials, 734 
factorial functions, 556-558 
Hermite polynomials, 713 
Laguerre polynomials, 724 
Legendre polynomials, 647 
spherical Bessel functions, 628 
convergence of series, improvement of, 288, 
296, 334 

differential equations, 491-496 
first-order, 491 

predictor-corrector methods, 493 
Runge-Kutta method, 492 
second-order, 494 
factorial function, 557, 558 
integral equations, 885-887 
inverse Laplace transform, 829 
Rayleigh-Ritz variational technique, 957-961 
Nutation, earth’s, 833 


Oblique coordinates, 164, 206-209, 563 
Olbers’ paradox, 291 

Operators. See also Angular momentum operator 
adjoint, 498 
del, 33 

integral. See Integral transforms 
ladder (raising, lowering) 

Hermite functions, 716 
matrices, 187 

spherical harmonics, 687-691 
linear differential, 113, 497 

Optical dispersion, 424, 857-859 
Orthogonal eigenfunctions 
Hilbert-Schmidt integral equations, 891 
Sturm-Liouville differential equations, 511 
Orthogonal polynomials, 520 
Orthogonality 

curvilinear coordinates, 87 
functions, 511 
vectors, 15 

Orthogonality condition, 11, 194, 197 
Orthogonalization, Gram-Schmidt method, 
516-523 

Oscillator, linear 
damping 

driving force included, Laplace transform 
solution, 851 

Laplace transform solution, 838, 845 
Green’s function, 904 


integral equations for, 870, 904 
Laplace transform solutions, 832 
momentum wave function, 818 
quantum mechanical development, 715, 716 
scalar potential, 68 
self-adjoint equation, 500 
series solution of equation, 455 
singularities in equation, 454 


Parity, 107 
Bessel functions, 585 
Chebyshev functions, 735 
differential operator, 459 
Fourier cosine, sine transforms, 801 
Hermite functions, 713 
Legendre functions, 649 
associated, 669 
second kind, 706 
spherical harmonics, 684 
spherical modified Bessel functions, 633 
spherical polar coordinates, 107 
vector spherical harmonics, 709 
Parseval’s identity, 780 
Parseval’s relation, 425, 812 
Partial differential equations, 437 
boundary conditions, 502 
Partial fractions, 827 
Particle, quantum mechanical 
Lagrangian multipliers, 947 
in rectangular box, 117, 947 
in right circular cylinder, 587, 948 
in sphere, 628, 634, 960 
Pauli spin matrices, 186, 211 
special unitary group, SU(2), 265-267 
Pi (π)

Leibnitz formula, 318, 763, 111 
Wallis formula, 348, 565 
Pochhammer symbol, 533, 749, 753 
Poisson equation, 77, 437, 813 
Green’s function, 480, 485, 897 
Poisson’s ratio, 146, 149 
Polar vectors, 128 
Potential theory, 64-74
conservative force, 65 
scalar potential, 64, 80 
electrostatic, 151, 592, 596, 614, 656-659, 
830, 846 

gravitational, 67, 655, 683 
vector potential, 47, 69, 80, 105, 110, 151, 325,
678 

current loop, 105, 325, 672-676, 708-710 
Power series, 313-321 
differentiation, integration, 314, 315 
inversion, 316

solution of differential equations, 454-467, 
473 

uniqueness theorem, 315, 320
Principal axes, 219 
Projection operators, 518, 538, 654 
Pseudoscalar, 131 
Pseudotensor, 128-137
definition of, 131 
Pseudovector, 131 


Quadrature 
Gaussian, 968-974 
interpolatory formulas, 968 
Simpson’s rule, 969 
Quantum mechanics 

angular momentum. See Angular momentum operator
deuteron, 500-502
expectation values, 506, 815 
hydrogen atom 

associated Laguerre polynomials, 726-728 
momentum representation, 816, 817 
hydrogen molecular ion, 466 
momentum representation, 814-820, 867 
particle. See Particle, quantum mechanical 
scattering, 409-411, 660, 915-920 
Schrödinger representation, 817
Schrödinger wave equation, 954
sum rules, 427 
wave packet, 819 
Quaternions, 10, 20, 185 
Quotient rule, 126, 127, 135 


Radioactive decay, 837 
Rayleigh equation, 665, 922 
Rayleigh formulas, 628 
Rayleigh-Ritz variational method, 957-961 
Reciprocity principle, Green’s functions, 488, 
901 

Recurrence relations 
Bessel functions, 576 
spherical Bessel functions, 627 
Chebyshev functions, 732 
confluent hypergeometric functions, 757 
exponential integral function, 570 
factorial functions, 544 
gamma functions, 539 
Hankel functions, 605
Hermite functions, 712

hypergeometric functions, 750 
incomplete gamma function, 569 
Laguerre functions, 724 

associated Laguerre functions, 725 
Legendre functions, 645 

associated Legendre functions, 668 
second kind, 705 
modified Bessel functions, 611, 614 

spherical modified Bessel functions, 634 
Neumann functions, 599 
polygamma functions, 553 
Relativistic particle, Lagrangian, 941 
Residues 

Bromwich integral, 853-861 
calculus of residues, 396-421 
residue theorem, 400 
Riemann-Christoffel curvature tensor, 123 


Riemann zeta function, 284, 289, 332, 333, 550 
Fourier series evaluation, 772, 773, 775 
table of values, 332 
Rodrigues representation 
Chebyshev polynomials, 735, 738
Hermite polynomials, 713 
Laguerre polynomials, 723 
associated Laguerre polynomials, 726, 728 
Legendre polynomials, 663 

associated Legendre polynomials, 681 
Rotation 

angular momentum and, 261-264 
of coordinates, 8-12, 119, 120, 191-203, 
261-264 

of functions, 264 
of vectors, 201, 202 
Runge-Kutta solution, 492 


Saddle point. See Steepest descent, method of 
Scalar, definition of, 1, 9, 16, 119 
Scalar potential, 64, 80 
Scalar product of vectors, 13-18, 511 
Scattering, quantum mechanical. Green’s function, 
Schmidt orthogonalization. See Gram-Schmidt 
orthogonalization 
Schrödinger wave equation
hydrogen atom, 726 
momentum representation, 820, 867 
particle in a sphere, 928 
scattering, 915-920 
variational approach, 954 
Schwarz inequality, 527, 533 
generalized, 536 

Schwarz reflection principle, 377, 378, 553 
Secular equation, 221, 884 
Self-adjoint differential equations, 497-509 
Self-adjoint differential operator. See Hermitian 
differential operator 

Semiconvergent series. See Asymptotic series 
Separation of variables, 111-117, 440, 448-451 
Series solution of differential equations, 451-467
Bessel’s equation, 459 
Chebyshev series, range of convergence, 292 
Hermite’s equation, 464
hypergeometric series, range of convergence, 291

incomplete beta function, 292 
Legendre’s equation, 464, 701-704 
range of convergence, 288, 291 
recurrence relation, 456 
ultraspherical equation, range of convergence, 292

Shifted polynomials 
Chebyshev, 746, 747 
Legendre, 522 

Sine x, infinite product representation, 348 
Sine integral, 567 
asymptotic representation, 342, 343 
confluent hypergeometric representation, 756 
Laplace transform, 847 
Singularity, 396-400
branch point, 397 
differential equation, 451-454, 461
Laurent series, 396 
on contour of integration, 408-411
pole, 396 

Special unitary group, SU(2), 253, 267 
O3+ homomorphism, 255-258
Pauli spin matrices, 265-267 
Special unitary group, SU(3), 269 
Spherical Bessel functions. See Bessel functions 
Spherical harmonics, 680-685 
addition theorem, 261, 693-698 
Condon-Shortley phase, 682, 692 
harmonics; sectoral, tesseral, zonal, 685 
tensor spherical, 140, 710 
vector spherical, 707-711 
integrals, 698-700 
ladder operators, 687-691 
Laplace series, 682, 685 
orthogonality, 681 
Spherical polar coordinates, 102-111

Spherical tensor, 135 
Spinors, 123, 214 
Stark effect, 465 

Steepest descent, method of, 428-436 
factorial functions, 433 
Hankel functions, 431 
modified Bessel functions, 435 
Step function, 415, 484, 490, 804, 828, 840, 844 
Stirling’s series, 434, 555-559 
Stokes’ theorem, 61-64, 92 
application to Cauchy integral theorem, 
366-368 

Stress-strain tensors, 140-145 
Sturm-Liouville theory, 497-538, 652, 762, 903 
variational analog, 957 
Summation convention, 121, 125 
Symmetry 

differential operators, 459 
dispersion relations, 423 
dyadics, 138 
functions, 458 
Green’s function, 486, 901 
kernels, 890 
matrices, 201 
tensors, 122 


Taylor expansion, 43, 303-313, 376, 377, 491, 
767 

more than one variable, 309 
Tensor analysis, 118-167 
contravariant vector, 119 
covariant vector, 119 
definition of second rank tensor, 120 
differential operations, 164-167 
isotropic tensor, 122, 123, 136 
noncartesian tensors, 158-164 
scalar quantity, 119 
symmetry-antisymmetry, 122 


tensor transformation law, 120 
Tensor density. See Pseudotensor 
Thermodynamics, exact differentials, 69 
Thomas precession, 275 
Titchmarsh theorem, 426 
Transfer functions, 820-823 
Triple scalar product of vectors, 26-28 
Triple vector product of vectors, 28-30 
BAC-CAB rule, 29, 45, 49, 50 
Tschebycheff. See Chebyshev 


Ultraspherical equation, 735 
polynomials, 643, 731 
self-adjoint form, 500 

Uncertainty principle in quantum theory, 629, 
716, 803 
Uniqueness 

descending power series, 320, 675 
differential equation solution, 463 
inverse operator, 826 
Laurent expansion, 384 
power series, 315, 320, 456 
solutions of Laplace’s equation, 79 
Unit vectors 

coordinates, cartesian, 5 
circular cylindrical, 96 
spherical polar, 103 


Variational principles. See Calculus of variations 
Vector analysis, 1-84. See also Tensor analysis 
components, 4 
normal vectors, 15 
orthogonal vectors, 15 
reciprocal lattice, 28, 32, 207 
rotation of coordinates, 8, 193 
scalars, 1, 16 
triangle law of addition, 1 
vector, definitions of, 1, 7-13 
vector components, 4 
vector transformation law, 10, 119, 194 
Vector Laplacian. See Laplacian, vector 
Vector potential, 47, 69, 110, 325, 672 
Vector product of vectors, 18-26 
Vector space, 12, 530-534 
Vector spherical harmonics, 707-711 
Vierergruppe, 185, 239, 240, 242, 243 
Volterra integral equation, 865. See also Integral
equations 


Wallis formula for π, 348, 565
Wave equation

anomalous dispersion, 857-859 
derivation from Maxwell’s equations, 49 
Fourier transform solution, 808, 809 
Laplace transform solution, 841, 842 
Waveguide, coaxial, 101, 600, 603 
Whittaker functions, 756 
Work, potential, 66 
Wronskian
absence of third solution, 477
Bessel functions, 599, 602, 605, 608, 622
spherical, 631, 634
Chebyshev functions, 738
confluent hypergeometric functions, 757
Green’s function, construction of, 900
linear independence of functions, 468
second solution of differential equation, 469, 507
solutions of self-adjoint differential equation, 469-471, 507


Young’s modulus, 146, 149


Zeta function. See Riemann zeta function
Zeros, of functions, 636, 652, 963-967
of Bessel functions, 581